diff --git a/CHANGES.md b/CHANGES.md
index 13fb2e6..f9d6cfc 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -1,9 +1,19 @@
+## Composer alpha integration
+
+- Composer integration: `.runComposerAudio()` and `.runComposer()` (plus the raw
+  `.converse()` and `.event()` APIs), with `actions` support.
+- Bumped API version to `20220801`.
+- `interactive` now uses Composer for text inputs; use `!message` for `GET /message` and `!converse` for Composer audio inputs.
+- Added a pizza-ordering example (`examples/pizza.js`).
+
 ## v6.4.0
+
 - Add `POST /synthesize` integration.
 - Add `POST /dictation` integration.
 - New example using `synthesize()` and `dictation()`.
 
 ## v6.3.0
+
 - `speech()` emits `partialUnderstanding` events to support live understanding.
 - `apiVersion` updated to `20220608` and its type is now a number.
diff --git a/README.md b/README.md
index 62a860f..ba1d3e3 100644
--- a/README.md
+++ b/README.md
@@ -28,8 +28,14 @@ See `examples/messenger.js` for a thoroughly documented tutorial.
 
 The Wit module provides a Wit class with the following methods:
 
+- `runComposerAudio` - the [Composer](https://wit.ai/docs/recipes#composer) integration for voice;
+- `runComposer` - the [Composer](https://wit.ai/docs/recipes#composer) integration for other inputs;
+- `converse` - the Wit [converse](https://wit.ai/docs/http/#post__converse_link) API;
+- `event` - the Wit [event](https://wit.ai/docs/http#post__event_link) API;
 - `message` - the Wit [message](https://wit.ai/docs/http#get__message_link) API;
-- `speech` - the Wit [speech](https://wit.ai/docs/http#post__speech_link) API.
+- `speech` - the Wit [speech](https://wit.ai/docs/http#post__speech_link) API;
+- `dictation` - the Wit [dictation](https://wit.ai/docs/http#post__dictation_link) API;
+- `synthesize` - the Wit [synthesize](https://wit.ai/docs/http#post__synthesize_link) API.
 
 You can also require a library function to test out your Wit app in the terminal.
 `require('node-wit').interactive`
@@ -37,8 +43,9 @@ You can also require a library function to test out your Wit app in the terminal
 
 The Wit constructor takes the following parameters:
 
-- `accessToken` - the access token of your Wit instance
-- `logger` - (optional) the object handling the logging.
+- `accessToken` - the access token of your Wit instance;
+- `actions` - the object of [client action definitions for Composer](https://wit.ai/docs/recipes#run-custom-code);
+- `logger` - (optional) the object handling the logging;
 - `apiVersion` - (optional) the API version to use instead of the recommended one
 
 The `logger` object should implement the methods `debug`, `info`, `warn` and `error`.
@@ -50,14 +57,83 @@ Example:
 
 ```js
 const {Wit, log} = require('node-wit');
+const actions = {
+  confirm_order(contextMap) {
+    return {context_map: {...contextMap, order_confirmation: 'PIZZA42'}};
+  },
+};
+
 const client = new Wit({
   accessToken: MY_TOKEN,
+  actions,
   logger: new log.Logger(log.DEBUG), // optional
 });
 console.log(client.message('set an alarm tomorrow at 7am'));
 ```
 
+### .runComposerAudio()
+
+The [Composer](https://wit.ai/docs/recipes#composer) integration for voice.
+
+Takes the following parameters:
+
+- `sessionId` - a unique string identifying the user session
+- `contentType` - the Content-Type header
+- `body` - the audio `Readable` stream
+- `contextMap` - the [context map](https://wit.ai/docs/recipes#custom-context) object
+
+Emits `partialTranscription`, `response` and `fullTranscription` events.
+Runs the provided `actions` as instructed by the API response, and calls the API back with the resulting updated context map (unless the action returns `stop: true`).
+The Promise resolves with the final JSON payload of the last API call ([POST /converse](https://wit.ai/docs/http#post__converse_link) or [POST /event](https://wit.ai/docs/http#post__event_link)).
+
+See `lib/interactive.js` for an example.
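For illustration only (not part of this patch), a minimal sketch of a `runComposerAudio` call outside of `lib/interactive.js`, assuming a pre-recorded raw audio file; the token, session id, file path and empty `actions` object are placeholders:

```js
const fs = require('fs');
const {Wit} = require('node-wit');

// Client action handlers, as documented for the constructor above.
// Left empty here; real apps must define one handler per Composer action.
const client = new Wit({accessToken: process.env.WIT_TOKEN, actions: {}});

client.on('partialTranscription', text => console.log('…', text));
client.on('fullTranscription', text => console.log('>', text));
client.on('response', ({text}) => console.log('<', text));

client
  .runComposerAudio(
    'my-session-id',
    'audio/raw;encoding=signed-integer;bits=16;rate=16000;endian=little',
    fs.createReadStream('/tmp/output.raw'),
    {}, // initial context map
  )
  .then(({context_map, expects_input}) => {
    // expects_input tells you whether Composer is waiting for another user turn.
    console.log('updated context map:', JSON.stringify(context_map));
  })
  .catch(console.error);
```

In `lib/interactive.js` below, the same call streams microphone audio captured to `/tmp/output.raw` and reopens the microphone while `expects_input` is true.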
+
+### .runComposer()
+
+The [Composer](https://wit.ai/docs/recipes#composer) integration for other
+inputs, including text.
+
+Takes the following parameters:
+
+- `sessionId` - a unique string identifying the user session
+- `contextMap` - the [context map](https://wit.ai/docs/recipes#custom-context) object
+- `message` - the optional user text query
+
+Emits `response` events.
+Runs the provided `actions` as instructed by the API response, and calls the API back with the resulting updated context map (unless the action returns `stop: true`).
+The Promise resolves with the final JSON payload of the last [POST /event](https://wit.ai/docs/http#post__event_link) API call.
+
+See `lib/interactive.js` for an example.
+
+### .converse()
+
+The Wit [converse](https://wit.ai/docs/http/#post__converse_link) API.
+
+Takes the following parameters:
+
+- `sessionId` - a unique string identifying the user session
+- `contentType` - the Content-Type header
+- `body` - the audio `Readable` stream
+- `contextMap` - the [context map](https://wit.ai/docs/recipes#custom-context) object
+
+Emits `partialTranscription` and `fullTranscription` events.
+
+We recommend using `.runComposerAudio()` instead of this raw API.
+
+### .event()
+
+The Wit [event](https://wit.ai/docs/http#post__event_link) API.
+
+Takes the following parameters:
+
+- `sessionId` - a unique string identifying the user session
+- `contextMap` - the [context map](https://wit.ai/docs/recipes#custom-context) object
+- `message` - the optional user text query
+
+We recommend using `.runComposer()` instead of this raw API.
+
 ### .message()
 
 The Wit [message](https://wit.ai/docs/http/#get__message_link) API.
@@ -80,6 +156,8 @@ client
   .catch(console.error);
 ```
 
+See `lib/interactive.js` for another example integration.
+
 ### .speech()
 
 The Wit [speech](https://wit.ai/docs/http#post__speech_link) API.
@@ -130,7 +208,14 @@
 See `examples/synthesize-speech.js` for an example.
 
 ### interactive
 Starts an interactive conversation with your Wit app.
-Use `!speech` to send an audio request from the microphone, or enter any text input.
+
+Full conversational interactions:
+
+- Use `!converse` to send an audio request from the microphone using Composer.
+- Enter any text input to send a text request using Composer.
+
+One-off natural language requests:
+
+- Use `!speech` to send an audio request from the microphone.
+- Use `!message <text>` to send a text request.
 
 Example:
 
@@ -146,9 +231,7 @@ See the [docs](https://wit.ai/docs) for more information.
 
 The default (recommended, latest) API version is set in `config.js`.
 On May 13th, 2020, the `GET /message` API was updated to reflect the new data model: intents, traits and entities are now distinct.
-We updated the SDK to the latest version: `20200513`.
-You can target a specific version by passing the `apiVersion` parameter when
-creating the `Wit` object.
+You can target a specific version by passing the `apiVersion` parameter when creating the `Wit` object.
 
 ```json
 {
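Before moving on to the new example file, a rough sketch (not part of this diff) of a single text turn through `.runComposer()`; the token, utterance and returned context keys are illustrative and depend on the Wit app:

```js
const uuid = require('uuid');
const {Wit} = require('node-wit');

// Replace with real client action handlers (see examples/pizza.js below).
const actions = {};

const client = new Wit({accessToken: process.env.WIT_TOKEN, actions});

client
  .runComposer(uuid.v4(), {}, 'I would like a large margherita')
  .then(({context_map, expects_input}) => {
    // context_map accumulates state across turns; pass it back on the next call.
    // expects_input (when present) says whether Composer awaits another user turn.
    console.log(JSON.stringify(context_map), expects_input);
  })
  .catch(console.error);
```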
diff --git a/examples/pizza.js b/examples/pizza.js
new file mode 100644
index 0000000..501484d
--- /dev/null
+++ b/examples/pizza.js
@@ -0,0 +1,68 @@
+/**
+ * Copyright (c) Meta Platforms, Inc. and its affiliates. All Rights Reserved.
+ */
+
+'use strict';
+
+let Wit = null;
+let interactive = null;
+try {
+  // if running from repo
+  Wit = require('../').Wit;
+  interactive = require('../').interactive;
+} catch (e) {
+  Wit = require('node-wit').Wit;
+  interactive = require('node-wit').interactive;
+}
+
+const accessToken = (() => {
+  if (process.argv.length !== 3) {
+    console.log('usage: node examples/pizza.js <wit-access-token>');
+    process.exit(1);
+  }
+  return process.argv[2];
+})();
+
+const actions = {
+  process_order(contextMap) {
+    const {order} = contextMap;
+    if (typeof order !== 'object') {
+      console.log('could not find order');
+      return {context_map: contextMap};
+    }
+
+    const pizze = Array.from(order.pizze || []);
+    const pizze_number = pizze.length;
+    if (pizze_number < 1) {
+      console.log('could not find any pizze in the order');
+      return {context_map: contextMap};
+    }
+
+    const order_number = pizze[0].type.substring(0, 3).toUpperCase() + '-42X6';
+
+    return {context_map: {...contextMap, pizze_number, order_number}};
+  },
+  make_summary(contextMap) {
+    const {order} = contextMap;
+    if (typeof order !== 'object') {
+      console.log('could not find order');
+      return {context_map: contextMap};
+    }
+
+    const pizze = Array.from(order.pizze || []);
+    if (pizze.length < 1) {
+      console.log('could not find any pizze in the order');
+      return {context_map: contextMap};
+    }
+
+    const order_summary = pizze
+      .map(({size, type}) => 'a ' + size + ' ' + type)
+      .join(', ');
+
+    return {context_map: {...contextMap, order_summary}, stop: true};
+  },
+};
+
+const client = new Wit({accessToken, actions});
+interactive(client);
diff --git a/lib/config.js b/lib/config.js
index ddc9781..e4074a2 100644
--- a/lib/config.js
+++ b/lib/config.js
@@ -3,6 +3,6 @@
  */
 
 module.exports = {
-  DEFAULT_API_VERSION: 20220608,
+  DEFAULT_API_VERSION: 20220801,
   DEFAULT_WIT_URL: 'https://api.wit.ai',
 };
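For illustration only (not part of this diff), here is what the two actions in the example above return for a plausible context map; the exact shape of `order` depends on how the Wit app's Composer and entities are configured:

```js
// A context map Composer might have built up from the user's utterances:
const contextMap = {order: {pizze: [{size: 'large', type: 'margherita'}]}};

// process_order(contextMap) would resolve to:
//   {context_map: {...contextMap, pizze_number: 1, order_number: 'MAR-42X6'}}

// make_summary(contextMap) would resolve to (and stop the client-side loop):
//   {context_map: {...contextMap, order_summary: 'a large margherita'}, stop: true}
```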
diff --git a/lib/interactive.js b/lib/interactive.js
index 03e14cf..3cdfdbe 100644
--- a/lib/interactive.js
+++ b/lib/interactive.js
@@ -7,11 +7,17 @@
 const fs = require('fs');
 const mic = require('mic');
 const readline = require('readline');
+const uuid = require('uuid');
+
+const sessionId = uuid.v4();
 
 const AUDIO_PATH = '/tmp/output.raw';
 const MIC_TIMEOUT_MS = 3000;
+const MSG_PREFIX_COMMAND = '!message';
+
+module.exports = (wit, handleResponse, initContextMap) => {
+  let contextMap = typeof initContextMap === 'object' ? initContextMap : {};
 
-module.exports = (wit, handleResponse, context) => {
   const rl = readline.createInterface({
     input: process.stdin,
     output: process.stdout,
@@ -25,12 +31,43 @@ module.exports = (wit, handleResponse, context) => {
   prompt();
 
   const makeResponseHandler = rsp => {
+    const {context_map} = rsp;
+    if (typeof context_map === 'object') {
+      contextMap = context_map;
+    }
+
     if (handleResponse) {
       handleResponse(rsp);
     } else {
       console.log(JSON.stringify(rsp));
     }
-    prompt();
+
+    return rsp;
+  };
+
+  const openMic = onComplete => {
+    const microphone = mic({
+      bitwidth: '16',
+      channels: '1',
+      encoding: 'signed-integer',
+      endian: 'little',
+      fileType: 'raw',
+      rate: '16000',
+    });
+
+    const inputAudioStream = microphone.getAudioStream();
+    const outputFileStream = fs.WriteStream(AUDIO_PATH);
+    inputAudioStream.pipe(outputFileStream);
+
+    inputAudioStream.on('startComplete', () => {
+      setTimeout(() => {
+        microphone.stop();
+      }, MIC_TIMEOUT_MS);
+    });
+    inputAudioStream.on('stopComplete', () => onComplete());
+
+    microphone.start();
+    console.log('🎤 Listening...');
   };
 
   wit.on('partialTranscription', text => {
@@ -42,6 +79,9 @@ module.exports = (wit, handleResponse, context) => {
   wit.on('partialUnderstanding', rsp => {
     console.log('Live understanding: ' + JSON.stringify(rsp));
   });
+  wit.on('response', ({text}) => {
+    console.log('< ' + text);
+  });
 
   rl.on('line', line => {
     line = line.trim();
@@ -49,45 +89,61 @@ module.exports = (wit, handleResponse, context) => {
       return prompt();
     }
 
+    // POST /converse
+    if (line === '!converse') {
+      const onComplete = () => {
+        const stream = fs.ReadStream(AUDIO_PATH);
+        wit
+          .runComposerAudio(
+            sessionId,
+            'audio/raw;encoding=signed-integer;bits=16;rate=16000;endian=little',
+            stream,
+            contextMap,
+          )
+          .then(makeResponseHandler)
+          .then(({expects_input}) => {
+            if (expects_input) {
+              openMic(onComplete);
+            } else {
+              prompt();
+            }
+          })
+          .catch(console.error);
+      };
+
+      return openMic(onComplete);
+    }
+
     // POST /speech
     if (line === '!speech') {
-      const microphone = mic({
-        bitwidth: '16',
-        channels: '1',
-        encoding: 'signed-integer',
-        endian: 'little',
-        fileType: 'raw',
-        rate: '16000',
-      });
-
-      const inputAudioStream = microphone.getAudioStream();
-      const outputFileStream = fs.WriteStream(AUDIO_PATH);
-      inputAudioStream.pipe(outputFileStream);
-
-      inputAudioStream.on('startComplete', () => {
-        setTimeout(() => {
-          microphone.stop();
-        }, MIC_TIMEOUT_MS);
-      });
-      inputAudioStream.on('stopComplete', () => {
+      const onComplete = () => {
         const stream = fs.ReadStream(AUDIO_PATH);
         wit
          .speech(
            'audio/raw;encoding=signed-integer;bits=16;rate=16000;endian=little',
            stream,
-           context,
          )
          .then(makeResponseHandler)
+         .then(prompt)
          .catch(console.error);
-      });
-
-      microphone.start();
-      console.log('🎤 Listening...');
+      };
+      return openMic(onComplete);
+    }
 
-      return;
+    if (line.startsWith(MSG_PREFIX_COMMAND)) {
+      // GET /message
+      return wit
+        .message(line.slice(MSG_PREFIX_COMMAND.length))
+        .then(makeResponseHandler)
+        .then(prompt)
+        .catch(console.error);
     }
 
-    // GET /message
-    wit.message(line, context).then(makeResponseHandler).catch(console.error);
+    // POST /event
+    wit
+      .runComposer(sessionId, contextMap, line)
+      .then(makeResponseHandler)
+      .then(prompt)
+      .catch(console.error);
   });
 };
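Given the new `(wit, handleResponse, initContextMap)` signature above, the REPL can be seeded with an initial context map and a custom response handler. A minimal sketch (not part of this diff; token and context values are placeholders):

```js
const {Wit, interactive} = require('node-wit');

const client = new Wit({
  accessToken: process.env.WIT_TOKEN,
  actions: {}, // client action handlers, e.g. the ones from examples/pizza.js
});

// Second argument: optional response handler (defaults to JSON logging).
// Third argument: optional initial context map for the Composer session.
interactive(client, null, {user_name: 'Ada'});
```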
diff --git a/lib/wit.js b/lib/wit.js
index 5e808b2..b7ac424 100644
--- a/lib/wit.js
+++ b/lib/wit.js
@@ -20,6 +20,133 @@ class Wit extends EventEmitter {
     this.config = Object.freeze(validate(opts));
   }
 
+  runComposer(sessionId, contextMap, message) {
+    return this.event(sessionId, contextMap, message).then(
+      this.makeComposerHandler(sessionId),
+    );
+  }
+
+  runComposerAudio(sessionId, contentType, body, contextMap) {
+    return this.converse(sessionId, contentType, body, contextMap).then(
+      this.makeComposerHandler(sessionId),
+    );
+  }
+
+  converse(sessionId, contentType, body, contextMap) {
+    if (typeof sessionId !== 'string') {
+      throw new Error('Please provide a session ID (string).');
+    }
+
+    if (typeof contentType !== 'string') {
+      throw new Error('Please provide a content-type (string).');
+    }
+
+    if (!(body instanceof Readable)) {
+      throw new Error('Please provide an audio stream (Readable).');
+    }
+
+    const {apiVersion, headers, logger, proxy, witURL} = this.config;
+
+    const params = {
+      session_id: sessionId,
+      v: apiVersion,
+    };
+
+    if (typeof contextMap === 'object') {
+      params.context_map = JSON.stringify(contextMap);
+    }
+
+    const method = 'POST';
+    const fullURL = witURL + '/converse?' + encodeURIParams(params);
+    logger.debug(method, fullURL);
+
+    const req = fetch(fullURL, {
+      body,
+      method,
+      proxy,
+      headers: {
+        ...headers,
+        'Content-Type': contentType,
+        'Transfer-Encoding': 'chunked',
+      },
+    });
+
+    const _partialResponses = req
+      .then(
+        response =>
+          new Promise((resolve, reject) => {
+            logger.debug('status', response.status);
+            const bodyStream = response.body;
+
+            bodyStream.on('readable', () => {
+              let chunk;
+              let contents = '';
+              while (null !== (chunk = bodyStream.read())) {
+                contents += chunk.toString();
+              }
+
+              for (const rsp of parseResponse(contents)) {
+                const {error, is_final, text} = rsp;
+
+                // Live transcription
+                if (!(error || is_final)) {
+                  logger.debug('[converse] partialTranscription:', text);
+                  this.emit('partialTranscription', text);
+                }
+              }
+            });
+          }),
+      )
+      .catch(e =>
+        logger.error('[converse] could not parse partial response', e),
+      );
+
+    return req
+      .then(response => Promise.all([response.text(), response.status]))
+      .then(([contents, status]) => {
+        const finalResponse = parseResponse(contents).pop();
+        const {text} = finalResponse;
+
+        logger.debug('[converse] fullTranscription:', text);
+        this.emit('fullTranscription', text);
+
+        return [finalResponse, status];
+      })
+      .catch(e => e)
+      .then(makeWitResponseHandler(logger, 'converse'));
+  }
+
+  event(sessionId, contextMap, message) {
+    if (typeof sessionId !== 'string') {
+      throw new Error('Please provide a session ID (string).');
+    }
+
+    const {apiVersion, headers, logger, proxy, witURL} = this.config;
+
+    const params = {
+      session_id: sessionId,
+      v: apiVersion,
+    };
+
+    if (typeof contextMap === 'object') {
+      params.context_map = JSON.stringify(contextMap);
+    }
+
+    const body = {};
+    if (typeof message === 'string') {
+      body.type = 'message';
+      body.message = message;
+    }
+
+    const method = 'POST';
+    const fullURL = witURL + '/event?' + encodeURIParams(params);
+    logger.debug(method, fullURL);
+
+    return fetch(fullURL, {body: JSON.stringify(body), method, headers, proxy})
+      .then(response => Promise.all([response.json(), response.status]))
+      .then(makeWitResponseHandler(logger, 'event'));
+  }
+
   message(q, context, n) {
     if (typeof q !== 'string') {
       throw new Error('Please provide a text input (string).');
     }
@@ -100,6 +227,7 @@ class Wit extends EventEmitter {
           new Promise((resolve, reject) => {
             logger.debug('status', response.status);
             const bodyStream = response.body;
+
             bodyStream.on('readable', () => {
               let chunk;
               let contents = '';
@@ -189,12 +317,15 @@ class Wit extends EventEmitter {
                 const {error, is_final, text} = rsp;
 
                 // Live transcription
-                if (!(error)) {
+                if (!error) {
                   if (!is_final) {
                     logger.debug('[dictation] partial transcription:', text);
                     this.emit('partialTranscription', text);
                   } else {
-                    logger.debug('[dictation] full sentence transcription:', text);
+                    logger.debug(
+                      '[dictation] full sentence transcription:',
+                      text,
+                    );
                     this.emit('fullTranscription', text);
                   }
                 }
@@ -202,7 +333,9 @@ class Wit extends EventEmitter {
             });
           }),
       )
-      .catch(e => logger.error('[dictation] could not parse partial response', e));
+      .catch(e =>
+        logger.error('[dictation] could not parse partial response', e),
+      );
 
     return req
       .then(response => Promise.all([response.text(), response.status]))
@@ -219,7 +352,14 @@ class Wit extends EventEmitter {
       .then(makeWitResponseHandler(logger, 'dictation'));
   }
 
-  synthesize(q, voice, style = "default", speed = 100, pitch = 100, gain = 100) {
+  synthesize(
+    q,
+    voice,
+    style = 'default',
+    speed = 100,
+    pitch = 100,
+    gain = 100,
+  ) {
     if (typeof q !== 'string') {
       throw new Error('Please provide a text input (string).');
     }
@@ -240,7 +380,7 @@ class Wit extends EventEmitter {
       speed: speed,
       pitch: pitch,
       gain: gain,
-    }
+    };
 
     const method = 'POST';
     const fullURL = witURL + '/synthesize?' + encodeURIParams(params);
@@ -258,8 +398,45 @@ class Wit extends EventEmitter {
       .then(response => Promise.all([response, response.status]))
      .then(makeWitResponseHandler(logger, 'synthesize'));
   }
+
+  makeComposerHandler(sessionId) {
+    const {actions, logger} = this.config;
+
+    return ({context_map, action, expects_input, response}) => {
+      if (typeof context_map !== 'object') {
+        throw new Error(
+          'Unexpected context_map in API response: ' +
+            JSON.stringify(context_map),
+        );
+      }
+
+      if (response) {
+        logger.debug('[composer] response:', response);
+        this.emit('response', response);
+      }
+
+      if (action) {
+        logger.debug('[composer] got action', action);
+        return runAction(logger, actions, action, context_map).then(
+          ({context_map, stop}) => {
+            if (expects_input && !stop) {
+              return this.runComposer(sessionId, context_map);
+            }
+            return {context_map};
+          },
+        );
+      }
+
+      return {context_map, expects_input};
+    };
+  }
 }
 
+const runAction = (logger, actions, name, ...rest) => {
+  logger.debug('Running action', name);
+  return Promise.resolve(actions[name](...rest));
+};
+
 const makeWitResponseHandler = (logger, endpoint) => rsp => {
   const error = e => {
     logger.error('[' + endpoint + '] Error: ' + e);
@@ -344,6 +521,10 @@ const validate = opts => {
   opts.logger = opts.logger || new log.Logger(log.INFO);
   opts.proxy = getProxyAgent(opts.witURL);
 
+  if (opts.actions && typeof opts.actions !== 'object') {
+    throw new Error('Please provide actions mapping (string -> function).');
+  }
+
   return opts;
 };
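Putting the new pieces together, a scripted multi-turn text conversation driven by `runComposer` might look like the sketch below (not part of this patch). The utterances, the extracted entities and the resulting context keys all depend on the Wit app, and the handler may omit `expects_input` after an action has run:

```js
const uuid = require('uuid');
const {Wit} = require('node-wit');

// Replace with real client action handlers (see examples/pizza.js).
const actions = {};

const client = new Wit({accessToken: process.env.WIT_TOKEN, actions});

async function runConversation(utterances) {
  const sessionId = uuid.v4();
  let contextMap = {};

  for (const text of utterances) {
    const {context_map, expects_input} = await client.runComposer(
      sessionId,
      contextMap,
      text,
    );
    contextMap = context_map;
    // expects_input may be absent once an action has run; treat that as "done".
    if (!expects_input) {
      break;
    }
  }

  return contextMap;
}

runConversation(['I would like two large margheritas', 'yes, please confirm'])
  .then(finalContext => console.log(JSON.stringify(finalContext)))
  .catch(console.error);
```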