Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] WebRTC flag #49

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# ElevenLabs Monorepo for NPM Package

This repository contains multiple package published on npm under `@elevenlabs` scope.
This repository contains multiple package published on npm under `@elevenlabs` scope.
Separate packages can be found in the `packages` folder.

![LOGO](https://github.com/elevenlabs/elevenlabs-python/assets/12028621/21267d89-5e82-4e7e-9c81-caf30b237683)
Expand Down Expand Up @@ -42,23 +42,24 @@ pnpm link --global
pnpm link --global <pkg>
```

You can run `pnpm run dev` to automatically apply changes to your project.
You can run `pnpm run dev` to automatically apply changes to your project.
Note that many projects don't watch for changes inside of `node_modules` folder to rebuild.
You might have to restart the application, or modify you setup to watch for node_modules (possible development performance implications).

Also note that the above won't work with turbopack projects. Instead use webpack to develop locally.

Don't forget to run the `unlink` equivalent once you're done, to prevent confusion in the future.

## Creating New Package

You can always just add a new folder with package.json inside of `packages` folder.
You can always just add a new folder with package.json inside of `packages` folder.
Alternatively run `pnpm run create --name=[package-name]` in the root of this repository to create a new package from template.

## Publishing

To publish a package from the packages folder, create new GitHub release.
To publish a package from the packages folder, create new GitHub release.
Since there are multiple packages contained in this folder, the release name/tag should follow format `<package>@version`.
The release will trigger GitHub action publishing the package, and the tag will be used to publish specific package.
The release will trigger GitHub action publishing the package, and the tag will be used to publish specific package.

The GitHub action will only run the publish command. Make sure you've update the version number manually in `package.json`.
The GitHub action will only run the publish command. Make sure you've update the version number manually in `package.json`.

3 changes: 2 additions & 1 deletion package.json
Original file line number Diff line number Diff line change
Expand Up @@ -23,5 +23,6 @@
"esbuild",
"msw"
]
}
},
"packageManager": "[email protected]+sha512.b2dc20e2fc72b3e18848459b37359a32064663e5627a51e4c74b2c29dd8e8e0491483c3abb40789cfd578bf362fb6ba8261b05f0387d76792ed6e23ea3b1b6a0"
}
3 changes: 3 additions & 0 deletions packages/client/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -44,5 +44,8 @@
"type": "git",
"url": "git+https://github.com/elevenlabs/packages.git",
"directory": "packages/client"
},
"dependencies": {
"livekit-client": "^2.9.5"
}
}
58 changes: 32 additions & 26 deletions packages/client/src/index.ts
Original file line number Diff line number Diff line change
@@ -1,18 +1,17 @@
import { arrayBufferToBase64, base64ToArrayBuffer } from "./utils/audio";
import { Input, InputConfig } from "./utils/input";
import { Input } from "./utils/input";
import type { InputConfig } from "./utils/input";
import { Output } from "./utils/output";
import {
Connection,
DisconnectionDetails,
OnDisconnectCallback,
SessionConfig,
} from "./utils/connection";
import { ClientToolCallEvent, IncomingSocketEvent } from "./utils/events";
import type { ClientToolCallEvent, IncomingSocketEvent } from "./utils/events";
import { isAndroidDevice, isIosDevice } from "./utils/compatibility";
import { WSSConnection } from "./utils/connection/wssConnection";
import type { SessionConfig, DisconnectionDetails, OnDisconnectCallback } from "./utils/connection/connection.interface";
import type { Connection } from "./utils/connection/connection";


export type { InputConfig } from "./utils/input";
export type { IncomingSocketEvent } from "./utils/events";
export type { SessionConfig, DisconnectionDetails, Language } from "./utils/connection";
export type { SessionConfig, DisconnectionDetails, Language, ConnectionType } from "./utils/connection/connection.interface";
export type Role = "user" | "ai";
export type Mode = "speaking" | "listening";
export type Status =
Expand All @@ -28,16 +27,16 @@ export type ClientToolsConfig = {
clientTools: Record<
string,
(
parameters: any
) => Promise<string | number | void> | string | number | void
parameters: unknown
) => Promise<string | number | null> | string | number | null
>;
};
export type Callbacks = {
onConnect: (props: { conversationId: string }) => void;
// internal debug events, not to be used
onDebug: (props: any) => void;
onDebug: (props: unknown) => void;
onDisconnect: OnDisconnectCallback;
onError: (message: string, context?: any) => void;
onError: (message: string, context?: unknown) => void;
onMessage: (props: { message: string; source: Role }) => void;
onModeChange: (prop: { mode: Mode }) => void;
onStatusChange: (prop: { status: Status }) => void;
Expand Down Expand Up @@ -105,7 +104,8 @@ export class Conversation {
await new Promise(resolve => setTimeout(resolve, delay));
}

connection = await Connection.create(options);
connection = await WSSConnection.create(options);

[input, output] = await Promise.all([
Input.create({
...connection.inputFormat,
Expand All @@ -114,29 +114,33 @@ export class Conversation {
Output.create(connection.outputFormat),
]);

preliminaryInputStream?.getTracks().forEach(track => track.stop());
for (const track of preliminaryInputStream.getTracks()) {
track.stop();
}
preliminaryInputStream = null;

return new Conversation(fullOptions, connection, input, output);
} catch (error) {
fullOptions.onStatusChange({ status: "disconnected" });
preliminaryInputStream?.getTracks().forEach(track => track.stop());
for (const track of preliminaryInputStream?.getTracks() ?? []) {
track.stop();
}
connection?.close();
await input?.close();
await output?.close();
throw error;
}
}

private lastInterruptTimestamp: number = 0;
private lastInterruptTimestamp = 0;
private mode: Mode = "listening";
private status: Status = "connecting";
private inputFrequencyData?: Uint8Array;
private outputFrequencyData?: Uint8Array;
private volume: number = 1;
private currentEventId: number = 1;
private lastFeedbackEventId: number = 1;
private canSendFeedback: boolean = false;
private volume = 1;
private currentEventId = 1;
private lastFeedbackEventId = 1;
private canSendFeedback = false;

private constructor(
private readonly options: Options,
Expand Down Expand Up @@ -229,7 +233,8 @@ export class Conversation {
case "client_tool_call": {
console.info("Received client tool call request", parsedEvent.client_tool_call);
if (
this.options.clientTools.hasOwnProperty(
Object.prototype.hasOwnProperty.call(
this.options.clientTools,
parsedEvent.client_tool_call.tool_name
)
) {
Expand All @@ -250,17 +255,18 @@ export class Conversation {
is_error: false,
});
} catch (e) {
const errorMessage = e instanceof Error ? e.message : String(e);
this.onError(
"Client tool execution failed with following error: " +
(e as Error)?.message,
`Client tool execution failed with following error: ${errorMessage}`,
{
clientToolName: parsedEvent.client_tool_call.tool_name,
}
);

this.connection.sendMessage({
type: "client_tool_result",
tool_call_id: parsedEvent.client_tool_call.tool_call_id,
result: "Client tool execution failed: " + (e as Error)?.message,
result: `Client tool execution failed: ${errorMessage}`,
is_error: true,
});
}
Expand Down Expand Up @@ -363,7 +369,7 @@ export class Conversation {
}, 2000); // Adjust the duration as needed
};

private onError = (message: string, context?: any) => {
private onError = (message: string, context?: unknown) => {
console.error(message, context);
this.options.onError(message, context);
};
Expand Down
104 changes: 70 additions & 34 deletions packages/client/src/utils/connection.ts
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
import {
import type {
InitiationClientDataEvent,
ConfigEvent,
isValidSocketEvent,
OutgoingSocketEvent,
IncomingSocketEvent,
} from "./events";
import { isValidSocketEvent } from "./events";
import { Room, RoomEvent } from "livekit-client";

const MAIN_PROTOCOL = "convai";

Expand Down Expand Up @@ -53,13 +54,14 @@ export type SessionConfig = {
voiceId?: string;
};
};
customLlmExtraBody?: any;
customLlmExtraBody?: unknown;
dynamicVariables?: Record<string, string | number | boolean>;
connectionDelay?: {
default: number;
android?: number;
ios?: number;
};
connectionType?: ConnectionType;
} & (
| { signedUrl: string; agentId?: undefined }
| { agentId: string; signedUrl?: undefined }
Expand All @@ -84,11 +86,34 @@ export type DisconnectionDetails =
export type OnDisconnectCallback = (details: DisconnectionDetails) => void;
export type OnMessageCallback = (event: IncomingSocketEvent) => void;

export enum ConnectionType {
WEBSOCKET = "websocket",
WEBRTC = "webrtc",
}

const WSS_API_ORIGIN = "wss://api.elevenlabs.io";
const WSS_API_PATHNAME = "/v1/convai/conversation?agent_id=";

const WEBRTC_TOKEN_API_ORIGIN = "http://localhost:3000";
const WEBRTC_TOKEN_PATHNAME = "/api/token";
const WEBRTC_API_ORIGIN = "wss://livekit.rtc.eleven2.dev"

export class Connection {
public static async create(config: SessionConfig): Promise<Connection> {
return config.connectionType === ConnectionType.WEBSOCKET
? Connection.createWebSocketConnection(config)
: Connection.createWebRTCConnection(config);
}

private static async createWebRTCConnection(config: SessionConfig): Promise<Connection> {



return await Connection.createWebSocketConnection(config);

}

private static async createWebSocketConnection(config: SessionConfig): Promise<Connection> {
let socket: WebSocket | null = null;

try {
Expand All @@ -105,46 +130,28 @@ export class Connection {
const conversationConfig = await new Promise<
ConfigEvent["conversation_initiation_metadata_event"]
>((resolve, reject) => {
socket!.addEventListener(
if (!socket) {
reject(new Error("Socket is not initialized"));
return;
}

socket.addEventListener(
"open",
() => {
const overridesEvent: InitiationClientDataEvent = {
type: "conversation_initiation_client_data",
};

if (config.overrides) {
overridesEvent.conversation_config_override = {
agent: {
prompt: config.overrides.agent?.prompt,
first_message: config.overrides.agent?.firstMessage,
language: config.overrides.agent?.language,
},
tts: {
voice_id: config.overrides.tts?.voiceId,
},
};
}
const overridesEvent = Connection.getOverridesEvent(config);

if (config.customLlmExtraBody) {
overridesEvent.custom_llm_extra_body = config.customLlmExtraBody;
}

if (config.dynamicVariables) {
overridesEvent.dynamic_variables = config.dynamicVariables;
}

socket?.send(JSON.stringify(overridesEvent));
socket!.send(JSON.stringify(overridesEvent));
},
{ once: true }
);
socket!.addEventListener("error", event => {
socket.addEventListener("error", event => {
// In case the error event is followed by a close event, we want the
// latter to be the one that rejects the promise as it contains more
// useful information.
setTimeout(() => reject(event), 0);
});
socket!.addEventListener("close", reject);
socket!.addEventListener(
socket.addEventListener("close", reject);
socket.addEventListener(
"message",
(event: MessageEvent) => {
const message = JSON.parse(event.data);
Expand Down Expand Up @@ -181,6 +188,35 @@ export class Connection {
}
}

private static getOverridesEvent(config: SessionConfig): InitiationClientDataEvent {
const overridesEvent: InitiationClientDataEvent = {
type: "conversation_initiation_client_data",
};

if (config.overrides) {
overridesEvent.conversation_config_override = {
agent: {
prompt: config.overrides.agent?.prompt,
first_message: config.overrides.agent?.firstMessage,
language: config.overrides.agent?.language,
},
tts: {
voice_id: config.overrides.tts?.voiceId,
},
};
}

if (config.customLlmExtraBody) {
overridesEvent.custom_llm_extra_body = config.customLlmExtraBody;
}

if (config.dynamicVariables) {
overridesEvent.dynamic_variables = config.dynamicVariables;
}

return overridesEvent;
}

private queue: IncomingSocketEvent[] = [];
private disconnectionDetails: DisconnectionDetails | null = null;
private onDisconnectCallback: OnDisconnectCallback | null = null;
Expand Down Expand Up @@ -272,8 +308,8 @@ function parseFormat(format: string): FormatConfig {
throw new Error(`Invalid format: ${format}`);
}

const sampleRate = parseInt(sampleRatePart);
if (isNaN(sampleRate)) {
const sampleRate = Number.parseInt(sampleRatePart);
if (Number.isNaN(sampleRate)) {
throw new Error(`Invalid sample rate: ${sampleRatePart}`);
}

Expand Down
Loading