Stream AI content to the world
Before running the application, make sure you have the following prerequisites installed:
- Node.js (version 18.16 or higher) (run `nvm use` if you use NVM or similar tools)
- Twitch account with stream key for streaming
- OBS > 28.0.0 because it includes the obs-websocket plugin
- Clone the repo to your computer:

  ```shell
  git clone https://github.com/failfa-st/strai.git
  ```

- Go into the strai folder:

  ```shell
  cd strai
  ```

- Install dependencies:

  ```shell
  npm i
  ```

- Create the .env based on .env.example:

  ```shell
  cp .env.example .env
  ```

- Fill out the .env:

  ```shell
  # You get this password in the next step "OBS Setup"
  OBS_WEBSOCKET_SERVER_PASSWORD=

  # The full path to the video that gets looped as a default when nothing is visible
  OBS_DEFAULT_VIDEO=
  ```
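For reference, a minimal sketch of how these values could be read at startup, assuming the `dotenv` package is used to load the .env file (how strai actually loads its configuration may differ):

```typescript
import "dotenv/config";

// Fail early if the required configuration from .env.example is missing.
const { OBS_WEBSOCKET_SERVER_PASSWORD, OBS_DEFAULT_VIDEO } = process.env;

if (!OBS_WEBSOCKET_SERVER_PASSWORD || !OBS_DEFAULT_VIDEO) {
  throw new Error(
    "Please fill out OBS_WEBSOCKET_SERVER_PASSWORD and OBS_DEFAULT_VIDEO in .env"
  );
}
```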
In OBS, you need to create the following base setup:
- A scene named `default` with a Sources > Media Source named `defaultVideo` with the following configuration:
  - Local file: true
  - Loop: true
  - Restart playback when source becomes available: true
  - Use hardware decoding when available: true
  - Show nothing when playback ends: false
  - Close file when inactive: false
  - Leave all other settings on their defaults
- A scene named `queue`
- A scene named `stream` with:
  - Sources > Group named `setup`
  - Sources > Scene, select `default`, and put it into the group `setup`
  - Sources > Scene, select `queue`, and put it into the group `setup`
  - Make sure that the scenes inside the group are in this order:
    1. `queue`
    2. `default`
Then you can also configure the WebSocket server:

- Tools > WebSocket Server Settings
- Set the server port to 4455, as this is the default used here
- Enable Authentication: true
- Click on "Show Connect Info" to get the server password (this is the value for `OBS_WEBSOCKET_SERVER_PASSWORD` in your .env)
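With that password in your .env, the application can talk to OBS over the WebSocket server. Below is a minimal, illustrative sketch (not the project's actual code) of connecting with `obs-websocket-js` (v5) and checking that the scenes from the base setup exist; the function name, the startup check, and switching to the `stream` scene are assumptions for this example:

```typescript
import "dotenv/config";
import OBSWebSocket from "obs-websocket-js";

const obs = new OBSWebSocket();

async function connectToObs() {
  // 4455 is the default port of the OBS WebSocket server.
  await obs.connect(
    "ws://127.0.0.1:4455",
    process.env.OBS_WEBSOCKET_SERVER_PASSWORD
  );

  // Sanity-check that the scenes from the base setup exist.
  const { scenes } = await obs.call("GetSceneList");
  const sceneNames = scenes.map((scene) => String(scene.sceneName));
  for (const required of ["default", "queue", "stream"]) {
    if (!sceneNames.includes(required)) {
      throw new Error(`Missing OBS scene: "${required}"`);
    }
  }

  // Switch the program output to the "stream" scene.
  await obs.call("SetCurrentProgramScene", { sceneName: "stream" });
}

connectToObs().catch(console.error);
```

The next diagram shows how this OBS connection fits into the overall pipeline.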
```mermaid
graph TB
    A[Stable Diffusion WebUI] -->|Generates Picture| D[SadTalker]
    B[Twitch Chat] -->|Extracts Messages| E{System Prompt to OpenAI GPT}
    E -->|Generates Text| F[Bark]
    F -->|Generates WAV File| D
    D -->|Generates MP4 Video| G[OBS via obs-websocket-js]
    G -->|Streams Video| H[Twitch]
```
This diagram describes the following steps:
- `Stable Diffusion WebUI` generates a picture of a person, which is used by `SadTalker`.
- Messages are extracted from `Twitch Chat` and transformed into an API call to `OpenAI GPT`, using a strong system prompt that represents a specific persona.
- The system prompt to `OpenAI GPT` generates a response to the chat message, which is fed to `Bark`.
- `Bark` generates a WAV file based on the provided text.
- `SadTalker` combines the picture from `Stable Diffusion WebUI` and the WAV file from `Bark` to generate an MP4 video, which contains a face that speaks.
- The generated MP4 video is then input into `OBS` using `obs-websocket-js`.
- `OBS` streams the video to `Twitch`.
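A rough sketch of the chat-to-text part of that pipeline, assuming `tmi.js` for Twitch chat, the official `openai` SDK, and an `OPENAI_API_KEY` environment variable (the persona prompt, model, and channel name are placeholders, and strai's real implementation may differ):

```typescript
import tmi from "tmi.js";
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// A strong system prompt that pins the model to a specific persona.
const personaPrompt =
  "You are 'Strai', a friendly virtual streamer. Answer chat messages in character, in 1-2 short sentences.";

const chat = new tmi.Client({ channels: ["your_channel"] });

chat.on("message", async (_channel, tags, message, self) => {
  if (self) return; // ignore our own messages

  // Turn the chat message into a persona-flavored response.
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: personaPrompt },
      { role: "user", content: `${tags.username} says: ${message}` },
    ],
  });

  const reply = completion.choices[0].message.content;
  // `reply` would then go to Bark (text-to-speech) and SadTalker (talking-head video).
  console.log(reply);
});

chat.connect().catch(console.error);
```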
Inside OBS, the application then controls playback like this:

- Enable the `defaultVideo`
  - This makes sure that we have a loop constantly running when nothing is happening
- When `OBSRemoteControl.addVideo` is called:
  - Add the video to a video queue
  - This creates a new Media Source inside of `queue`, using the same transform settings that were used to position the `defaultVideo`. It will also set the audio output to "Monitor and Output" so that the video's audio is audible in the stream
- The oldest video from the video queue will be played
- Once the video is playing in OBS, it will be removed from the video queue
- Once the video playback has ended in OBS, the video will be removed from OBS
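To make that queue behaviour concrete, here is a simplified sketch of what `OBSRemoteControl.addVideo` and the playback loop could look like with `obs-websocket-js`; copying the `defaultVideo` transform is omitted, and names like `playNext` are illustrative rather than strai's actual code:

```typescript
import OBSWebSocket from "obs-websocket-js";

// Assumes obs.connect(...) has been called as in the OBS setup sketch above.
const obs = new OBSWebSocket();

const videoQueue: string[] = []; // absolute paths of generated MP4 files
let currentInput: string | null = null;

// Queue a generated video and start playback if nothing is currently playing.
export async function addVideo(filePath: string) {
  videoQueue.push(filePath);
  if (!currentInput) {
    await playNext();
  }
}

// Play the oldest video from the queue as a new Media Source in the "queue" scene.
async function playNext() {
  const filePath = videoQueue.shift();
  if (!filePath) {
    currentInput = null;
    return;
  }

  const inputName = `video-${Date.now()}`;
  currentInput = inputName;

  // Create the Media Source (copying the defaultVideo transform is omitted here).
  await obs.call("CreateInput", {
    sceneName: "queue",
    inputName,
    inputKind: "ffmpeg_source",
    inputSettings: { is_local_file: true, local_file: filePath, looping: false },
  });

  // "Monitor and Output" so the video's audio is audible on the stream.
  await obs.call("SetInputAudioMonitorType", {
    inputName,
    monitorType: "OBS_MONITORING_TYPE_MONITOR_AND_OUTPUT",
  });
}

// When playback ends, remove the source from OBS and continue with the queue.
obs.on("MediaInputPlaybackEnded", async ({ inputName }) => {
  if (inputName !== currentInput) return;
  await obs.call("RemoveInput", { inputName });
  await playNext();
});
```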