Merged
67 commits
27ac0fc
feat(tts): add resolveVoice() and getServerVoiceList() utilities
wyuc Mar 21, 2026
eabaad2
feat(tts): add AudioIndicator equalizer bars component
wyuc Mar 21, 2026
9053b38
feat(tts): add onSegmentSealed callback to StreamBuffer
wyuc Mar 21, 2026
ea8c189
feat(tts): add voiceOverrides field to AgentConfig and AgentTemplate
wyuc Mar 21, 2026
984bdb3
feat(tts): add useDiscussionTTS hook with audio queue and cleanup
wyuc Mar 21, 2026
93e9542
feat(tts): add audio state indicator to Roundtable bubble
wyuc Mar 21, 2026
38ef134
feat(tts): wire onSegmentSealed callback through chat sessions
wyuc Mar 21, 2026
266e976
feat(tts): add per-agent voice dropdown to AgentBar
wyuc Mar 21, 2026
8e4e964
feat(tts): integrate useDiscussionTTS in Stage and pass state to Roun…
wyuc Mar 21, 2026
f0f0847
style(tts): refine voice dropdown to pill-style selector
wyuc Mar 21, 2026
606682a
style(tts): use shadcn Select for voice dropdown, link with TTS toggle
wyuc Mar 21, 2026
a2e5124
style(tts): add voice label prefix and always show dropdown
wyuc Mar 21, 2026
0ffdf49
style(tts): add volume icon hint in collapsed AgentBar
wyuc Mar 21, 2026
60abace
fix(tts): fix voice dropdown layout and click handling
wyuc Mar 21, 2026
e835c00
refactor(tts): redesign AgentBar voice layout for compactness
wyuc Mar 21, 2026
0b00493
feat(tts): cross-provider voice selection per agent
wyuc Mar 21, 2026
f6cc8b8
fix(tts): align role badge and voice pill across agent rows
wyuc Mar 22, 2026
cbad4df
fix(tts): fix role badge and voice pill alignment
wyuc Mar 22, 2026
0431589
style(tts): align role badge and voice pill across agent rows
wyuc Mar 22, 2026
e10d0b0
fix(tts): use fixed w-[60px] for role badge alignment
wyuc Mar 22, 2026
82bc273
fix(tts): use fixed w-[88px] for voice pill alignment
wyuc Mar 22, 2026
13cf0a2
fix(tts): prevent click-outside from closing AgentBar when Select por…
wyuc Mar 22, 2026
55bf944
fix(tts): comprehensive voice picker rewrite
wyuc Mar 22, 2026
b0b9ba3
fix(tts): align voice provider availability with toolbar logic
wyuc Mar 22, 2026
77b2b5b
fix(tts): fallback to first available provider when global provider h…
wyuc Mar 22, 2026
e29b901
refactor(tts): remove global provider fallback from voice resolution
wyuc Mar 22, 2026
cde2e7e
feat(tts): add browser native TTS voices to agent voice picker
wyuc Mar 22, 2026
36e3997
feat(tts): simplify toolbar TTS to on/off toggle, add disabled state
wyuc Mar 22, 2026
1c06e00
refactor(tts): simplify Settings TTS tab to toggle + provider config
wyuc Mar 22, 2026
27370b0
refactor(tts): simplify media popover TTS tab to toggle only
wyuc Mar 22, 2026
8c6357f
fix(tts): add voice config hint to media popover TTS tab
wyuc Mar 22, 2026
230894a
feat(tts): add per-voice preview button in voice picker
wyuc Mar 22, 2026
bcca83d
fix(tts): preview text follows course language instead of UI language
wyuc Mar 22, 2026
703886b
refactor(tts): redesign AgentBar expanded panel layout
wyuc Mar 22, 2026
c5aeb56
refactor(tts): merge max turns into teacher row
wyuc Mar 22, 2026
f8aa1be
refactor(tts): separate teacher row and max turns, use stepper UI
wyuc Mar 22, 2026
8bd865f
fix(tts): increase voice pill contrast in dark mode
wyuc Mar 22, 2026
1fd9adc
fix(tts): make max turns input editable, tighten panel padding
wyuc Mar 22, 2026
1bd31b6
fix(tts): restore shuffle animation in auto mode (compact version)
wyuc Mar 22, 2026
a559514
fix(tts): adjust auto mode text spacing and add voice auto-assign hint
wyuc Mar 22, 2026
09469f0
fix(tts): auto-close voice popover after selecting a voice
wyuc Mar 22, 2026
ad8edb7
fix(tts): increase auto mode vertical padding for better balance
wyuc Mar 22, 2026
151afbf
fix(tts): push auto mode text toward bottom with flex spacer
wyuc Mar 22, 2026
55408f0
fix(tts): reduce auto mode bottom padding
wyuc Mar 22, 2026
cbc45ea
feat(tts): wait for TTS audio to finish before next agent turn
wyuc Mar 22, 2026
b5dc015
fix(tts): keep bubble visible while TTS audio is still playing
wyuc Mar 22, 2026
16cb71a
feat(tts): hold discussion bubble until TTS audio finishes
wyuc Mar 22, 2026
41c2e99
fix(tts): fix bubble hold - guard onStopSession instead of onLiveSpeech
wyuc Mar 22, 2026
717c80e
fix(tts): guard BOTH onLiveSpeech and onStopSession for bubble hold
wyuc Mar 22, 2026
963b2f2
feat(tts): hold bubble during TTS playback and respect playback speed
wyuc Mar 22, 2026
759bf11
feat(tts): LLM picks voice matching agent persona during generation
wyuc Mar 22, 2026
5e0eac9
fix(tts): restore volume slider in classroom toolbar
wyuc Mar 22, 2026
0e61845
fix(tts): teacher uses global lecture voice in discussion when no voi…
wyuc Mar 22, 2026
db18945
fix(tts): teacher always uses global lecture voice, no overrides
wyuc Mar 22, 2026
880190e
fix(tts): sync playback speed to currently playing audio in real-time
wyuc Mar 22, 2026
3491f1b
fix(tts): address code review issues
wyuc Mar 22, 2026
07fefd2
fix(tts): address PR review — abort preview fetch, defer error recovery
wyuc Mar 23, 2026
0e756f4
fix(tts): restore teacher voice pill, respect voiceConfig override
wyuc Mar 23, 2026
f489d5f
fix(tts): sync volume and mute to discussion TTS audio in real-time
wyuc Mar 23, 2026
0dfe405
Merge branch 'main' into feat/discussion-tts
wyuc Mar 23, 2026
6f84b5f
fix(tts): allow browser-native TTS alongside server providers
wyuc Mar 23, 2026
7292dc7
fix(tts): remove top padding from voice popover content
wyuc Mar 23, 2026
460cbf2
fix(tts): make selectedAgents reactive to voiceConfig changes
wyuc Mar 23, 2026
88fae23
fix(tts): use agents record instead of listAgents() to avoid infinite…
wyuc Mar 23, 2026
9b2f91c
fix(tts): single source of truth for teacher voice
wyuc Mar 23, 2026
a451872
Merge branch 'main' into feat/discussion-tts
cosarah Mar 23, 2026
2ecbe8c
feat: add avatar descriptions for smarter LLM avatar selection
wyuc Mar 23, 2026
72 changes: 59 additions & 13 deletions app/api/generate/agent-profiles/route.ts
@@ -36,6 +36,8 @@ interface RequestBody {
sceneOutlines?: { title: string; description?: string }[];
language: string;
availableAvatars: string[];
avatarDescriptions?: Array<{ path: string; desc: string }>;
availableVoices?: Array<{ providerId: string; voiceId: string; voiceName: string }>;
}

function stripCodeFences(text: string): string {
@@ -50,7 +52,14 @@ function stripCodeFences(text: string): string {
export async function POST(req: NextRequest) {
try {
const body = (await req.json()) as RequestBody;
- const { stageInfo, sceneOutlines, language, availableAvatars } = body;
+ const {
+   stageInfo,
+   sceneOutlines,
+   language,
+   availableAvatars,
+   avatarDescriptions,
+   availableVoices,
+ } = body;

// ── Validate required fields ──
if (!stageInfo?.name) {
@@ -79,6 +88,27 @@ export async function POST(req: NextRequest) {

const systemPrompt = `You are an expert instructional designer. Generate agent profiles for a multi-agent classroom simulation. Decide the appropriate number of agents (typically 3-5) based on the course content and complexity. Return ONLY valid JSON, no markdown or explanation.`;

// Build voice list for prompt (if available)
const voiceListStr =
availableVoices && availableVoices.length > 0
? JSON.stringify(
availableVoices.map((v) => ({
id: `${v.providerId}::${v.voiceId}`,
name: v.voiceName,
})),
)
: '';

const voicePrompt = voiceListStr
? `- Each agent should be assigned a voice that matches their persona from this list: ${voiceListStr}
- Pick a voice that suits the agent's personality and role (e.g. authoritative voice for teacher, lively voice for energetic student)
- Try to use different voices for each agent`
: '';

const voiceJsonField = voiceListStr
? ',\n "voice": "string (voice id from available list, e.g. \'qwen-tts::Cherry\')"'
: '';

const userPrompt = `Generate agent profiles for the following course:

Course name: ${stageInfo.name}
@@ -90,10 +120,13 @@ Requirements:
- Priority values: teacher=10 (highest), assistant=7, student=4-6
- Each agent needs: name, role, persona (2-3 sentences describing personality and teaching/learning style)
- Names and personas must be in language: ${language}
- - Each agent must be assigned one avatar from this list: ${JSON.stringify(availableAvatars)}
+ - Each agent must be assigned one avatar from this list: ${JSON.stringify(avatarDescriptions && avatarDescriptions.length > 0 ? avatarDescriptions.map((a) => ({ path: a.path, description: a.desc })) : availableAvatars)}
- Pick an avatar that visually matches the agent's personality and role
- Try to use different avatars for each agent
- Use the "path" value as the avatar field in the output
- Each agent must be assigned one color from this list: ${JSON.stringify(COLOR_PALETTE)}
- Each agent must have a different color
${voicePrompt}

Return a JSON object with this exact structure:
{
@@ -104,7 +137,7 @@ Return a JSON object with this exact structure:
"persona": "string (2-3 sentences)",
"avatar": "string (from available list)",
"color": "string (hex color from palette)",
"priority": number (10 for teacher, 7 for assistant, 4-6 for student)
"priority": number (10 for teacher, 7 for assistant, 4-6 for student)${voiceJsonField}
}
]
}`;
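
The avatar line in the prompt above packs a fallback into one inline expression: send the described avatars when `avatarDescriptions` is non-empty, otherwise send the bare `availableAvatars` paths. A minimal sketch of that choice pulled into a helper, for reading the logic in isolation. The name `avatarChoicesForPrompt` is hypothetical and not part of this PR; only the field names (`path`, `desc`, `description`) come from the diff.

```typescript
interface AvatarDescription {
  path: string;
  desc: string;
}

// Hypothetical helper (not in the PR): returns the JSON string that the
// prompt would embed in its "avatar from this list" requirement.
function avatarChoicesForPrompt(
  availableAvatars: string[],
  avatarDescriptions?: AvatarDescription[],
): string {
  if (avatarDescriptions && avatarDescriptions.length > 0) {
    // Rename `desc` to `description` so the model sees a self-explanatory key.
    const described = avatarDescriptions.map((a) => ({
      path: a.path,
      description: a.desc,
    }));
    return JSON.stringify(described);
  }
  // Older clients that send no descriptions still get the plain path list.
  return JSON.stringify(availableAvatars);
}
```

Because the fallback keys off `avatarDescriptions.length`, a client that sends an empty array behaves the same as one that omits the field.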
@@ -130,6 +163,7 @@ Return a JSON object with this exact structure:
avatar: string;
color: string;
priority: number;
voice?: string;
}>;
};

@@ -161,16 +195,28 @@ Return a JSON object with this exact structure:
}

// ── Build output with IDs ──
- const agents = parsed.agents.map((agent, index) => ({
-   id: `gen-${nanoid(8)}`,
-   name: agent.name,
-   role: agent.role,
-   persona: agent.persona,
-   avatar: agent.avatar || availableAvatars[index % availableAvatars.length],
-   color: agent.color || COLOR_PALETTE[index % COLOR_PALETTE.length],
-   priority:
-     agent.priority ?? (agent.role === 'teacher' ? 10 : agent.role === 'assistant' ? 7 : 5),
- }));
+ const agents = parsed.agents.map((agent, index) => {
+   // Parse voice "providerId::voiceId" format
+   let voiceConfig: { providerId: string; voiceId: string } | undefined;
+   if (agent.voice && agent.voice.includes('::')) {
+     const [providerId, voiceId] = agent.voice.split('::');
+     if (providerId && voiceId) {
+       voiceConfig = { providerId, voiceId };
+     }
+   }
+
+   return {
+     id: `gen-${nanoid(8)}`,
+     name: agent.name,
+     role: agent.role,
+     persona: agent.persona,
+     avatar: agent.avatar || availableAvatars[index % availableAvatars.length],
+     color: agent.color || COLOR_PALETTE[index % COLOR_PALETTE.length],
+     priority:
+       agent.priority ?? (agent.role === 'teacher' ? 10 : agent.role === 'assistant' ? 7 : 5),
+     ...(voiceConfig ? { voiceConfig } : {}),
+   };
+ });

log.info(`Successfully generated ${agents.length} agent profiles for "${stageInfo.name}"`);
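
The hunk above parses the model's `voice` string in the `providerId::voiceId` format and keeps whatever comes back. A minimal sketch of the same round trip with two optional hardening steps, offered as an assumption rather than as what the PR does: it splits on the first `::` only (so a voice ID that itself contains `::` is not truncated) and it can check the result against the list that was offered to the model. The names `encodeVoiceKey` and `parseVoiceKey` are hypothetical.

```typescript
interface VoiceOption {
  providerId: string;
  voiceId: string;
  voiceName: string;
}

interface VoiceConfig {
  providerId: string;
  voiceId: string;
}

// Encode a voice the same way the prompt builder in the diff does.
function encodeVoiceKey(v: VoiceOption): string {
  return `${v.providerId}::${v.voiceId}`;
}

// Hypothetical parser (not in the PR). Returns undefined for anything that is
// missing, malformed, or (when `allowed` is given) not in the offered list.
function parseVoiceKey(raw: string | undefined, allowed?: VoiceOption[]): VoiceConfig | undefined {
  if (!raw) return undefined;
  const sep = raw.indexOf('::');
  if (sep <= 0) return undefined; // no separator, or empty providerId
  const providerId = raw.slice(0, sep);
  const voiceId = raw.slice(sep + 2); // everything after the first "::"
  if (!voiceId) return undefined;
  if (allowed && !allowed.some((v) => v.providerId === providerId && v.voiceId === voiceId)) {
    return undefined; // the model returned a voice that was never offered
  }
  return { providerId, voiceId };
}

// Sample data: 'qwen-tts::Cherry' is the example id used in the prompt.
const offeredVoices: VoiceOption[] = [
  { providerId: 'qwen-tts', voiceId: 'Cherry', voiceName: 'Cherry' },
];
console.log(parseVoiceKey(encodeVoiceKey(offeredVoices[0]), offeredVoices));
```

Whether the membership check is worth it depends on how downstream voice resolution treats an unknown `voiceConfig`; if it already falls back safely, the simpler parse in the diff may be enough.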

76 changes: 63 additions & 13 deletions app/generation-preview/page.tsx
@@ -11,6 +11,7 @@ import { cn } from '@/lib/utils';
import { useStageStore } from '@/lib/store/stage';
import { useSettingsStore } from '@/lib/store/settings';
import { useAgentRegistry } from '@/lib/orchestration/registry/store';
import { getAvailableProvidersWithVoices } from '@/lib/audio/voice-resolver';
import { useI18n } from '@/lib/hooks/use-i18n';
import {
loadImageMapping,
@@ -379,28 +380,77 @@ function GenerationPreviewContent() {

try {
const allAvatars = [
-   '/avatars/assist.png',
-   '/avatars/assist-2.png',
-   '/avatars/clown.png',
-   '/avatars/clown-2.png',
-   '/avatars/curious.png',
-   '/avatars/curious-2.png',
-   '/avatars/note-taker.png',
-   '/avatars/note-taker-2.png',
-   '/avatars/teacher.png',
-   '/avatars/teacher-2.png',
-   '/avatars/thinker.png',
-   '/avatars/thinker-2.png',
+   {
+     path: '/avatars/teacher.png',
+     desc: 'Male teacher with glasses, holding a book, green background',
+   },
+   {
+     path: '/avatars/teacher-2.png',
+     desc: 'Female teacher with long dark hair, blue traditional outfit, gentle expression',
+   },
+   {
+     path: '/avatars/assist.png',
+     desc: 'Young female assistant with glasses, pink background, friendly smile',
+   },
+   {
+     path: '/avatars/assist-2.png',
+     desc: 'Young female in orange top and purple overalls, cheerful and approachable',
+   },
+   {
+     path: '/avatars/clown.png',
+     desc: 'Energetic girl with glasses pointing up, green shirt, lively and fun',
+   },
+   {
+     path: '/avatars/clown-2.png',
+     desc: 'Playful girl with curly hair doing rock gesture, blue shirt, humorous vibe',
+   },
+   {
+     path: '/avatars/curious.png',
+     desc: 'Surprised boy with glasses, hand on cheek, curious expression',
+   },
+   {
+     path: '/avatars/curious-2.png',
+     desc: 'Boy with backpack holding a book and question mark bubble, inquisitive',
+   },
+   {
+     path: '/avatars/note-taker.png',
+     desc: 'Studious boy with glasses, blue shirt, calm and organized',
+   },
+   {
+     path: '/avatars/note-taker-2.png',
+     desc: 'Active boy with yellow backpack waving, blue outfit, enthusiastic learner',
+   },
+   {
+     path: '/avatars/thinker.png',
+     desc: 'Thoughtful girl with hand on chin, purple background, contemplative',
+   },
+   {
+     path: '/avatars/thinker-2.png',
+     desc: 'Girl reading a book intently, long dark hair, intellectual and focused',
+   },
];

const getAvailableVoicesForGeneration = () => {
const providers = getAvailableProvidersWithVoices(settings.ttsProvidersConfig);
return providers.flatMap((p) =>
p.voices.map((v) => ({
providerId: p.providerId,
voiceId: v.id,
voiceName: v.name,
})),
);
};

// No outlines yet — agent generation uses only stage name + description
const agentResp = await fetch('/api/generate/agent-profiles', {
method: 'POST',
headers: getApiHeaders(),
body: JSON.stringify({
stageInfo: { name: stage.name, description: stage.description },
language: currentSession.requirements.language || 'zh-CN',
- availableAvatars: allAvatars,
+ availableAvatars: allAvatars.map((a) => a.path),
+ avatarDescriptions: allAvatars.map((a) => ({ path: a.path, desc: a.desc })),
+ availableVoices: getAvailableVoicesForGeneration(),
}),
signal,
});
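
For reference, `getAvailableVoicesForGeneration` above flattens each provider's voice list into the `{ providerId, voiceId, voiceName }` entries that the route expects in `availableVoices`. A minimal standalone sketch of that flattening follows. The `ProviderWithVoices` shape is inferred from how the diff reads `p.providerId`, `v.id`, and `v.name`; the real return type of `getAvailableProvidersWithVoices` may differ, and `flattenVoices` and `example-provider` are hypothetical names used only for illustration.

```typescript
// Shape inferred from usage in the diff; treat as an assumption.
interface ProviderWithVoices {
  providerId: string;
  voices: Array<{ id: string; name: string }>;
}

interface GenerationVoice {
  providerId: string;
  voiceId: string;
  voiceName: string;
}

// Hypothetical standalone version of the helper in the diff: one flat entry
// per (provider, voice) pair, preserving provider order. Written with plain
// loops so it does not depend on Array.prototype.flatMap being available.
function flattenVoices(providers: ProviderWithVoices[]): GenerationVoice[] {
  const out: GenerationVoice[] = [];
  for (const p of providers) {
    for (const v of p.voices) {
      out.push({ providerId: p.providerId, voiceId: v.id, voiceName: v.name });
    }
  }
  return out;
}

// Sample data only: a provider with one voice and a provider with none.
const sampleProviders: ProviderWithVoices[] = [
  { providerId: 'qwen-tts', voices: [{ id: 'Cherry', name: 'Cherry' }] },
  { providerId: 'example-provider', voices: [] },
];
console.log(flattenVoices(sampleProviders));
```

A provider with no voices contributes nothing, so when no provider has any voices the request carries an empty `availableVoices` array and the route skips the voice instructions in the prompt.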