Chat + SenseVoice + Qwen3-TTS Streaming

SenseVoice transcribes microphone input. Chat replies are streamed as Qwen3-TTS PCM audio over /v1/chat/stream-tts.

SenseVoice WebSocket URL

Chat Streaming WebSocket URL

API Key

auto-send final transcript

auto-play audio

Idle

ASR: disconnected

Chat TTS: idle

assistant: idle

Voice

Language

Temperature

Top P

Max Tokens

ASR Partials

VAD End Silence ms

VAD RMS Threshold

Optional extra system prompt

Audio Controls

Tune full-duplex behavior live. These settings are saved in your browser.

barge-in

echo text filter

Barge-in sensitivity (RMS threshold)

0.060

Lower = more sensitive. Raise this if the assistant interrupts itself.
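As a rough illustration of what an RMS threshold means, the sketch below computes the root-mean-square level of one audio frame and compares it with the 0.060 default shown above. It assumes samples are floats in the range -1.0 to 1.0. The names frame_rms and is_near_end_speech are hypothetical, and the page's actual comparison (for example >= versus >) is not stated here.

```python
import math
from typing import Sequence

# Default shown in the Audio Controls panel.
BARGE_IN_RMS_THRESHOLD = 0.060


def frame_rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_near_end_speech(samples: Sequence[float],
                       threshold: float = BARGE_IN_RMS_THRESHOLD) -> bool:
    """Assumed rule: a frame counts as speech when its RMS reaches the threshold."""
    return frame_rms(samples) >= threshold


# A quiet frame (RMS about 0.05) is ignored at the default threshold
# but counts as speech once the threshold is lowered to 0.04.
quiet = [0.05, -0.05] * 80
print(is_near_end_speech(quiet))
print(is_near_end_speech(quiet, threshold=0.04))
```

Lowering the threshold lets quieter frames through, which is why a low value can cause the assistant's own playback, picked up by the microphone, to trigger an interruption.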

Barge-in hold time

450 ms

How long speech must persist before interrupting assistant playback.
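The hold time can be pictured as a simple timer that only fires once speech has been continuous for the configured duration. The sketch below is one way to express that, not the page's actual logic. The class name BargeInHold and its update method are hypothetical, and timestamps are assumed to be in milliseconds.

```python
from typing import Optional


class BargeInHold:
    """Tracks how long near-end speech has persisted (hypothetical helper)."""

    def __init__(self, hold_ms: float = 450.0) -> None:
        self.hold_ms = hold_ms
        self.speech_started_ms: Optional[float] = None

    def update(self, is_speech: bool, now_ms: float) -> bool:
        """Feed one frame's speech decision.

        Returns True once speech has lasted at least hold_ms without a gap,
        which is the point where playback would be interrupted.
        """
        if not is_speech:
            # Any non-speech frame resets the timer.
            self.speech_started_ms = None
            return False
        if self.speech_started_ms is None:
            self.speech_started_ms = now_ms
        return now_ms - self.speech_started_ms >= self.hold_ms


hold = BargeInHold(hold_ms=450.0)
for t, speech in [(0, True), (200, True), (300, False), (400, True), (850, True)]:
    print(t, hold.update(speech, t))
```

In this sketch a single silent frame resets the timer, so short noises such as a cough or a key press are less likely to cut off the assistant.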

Echo guard window

3500 ms

How long after playback ends to keep rejecting likely echo transcripts.
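A minimal sketch of the guard window check follows, assuming the client records the time at which playback ended. The function name in_echo_guard and its parameters are hypothetical. Treating active playback as guarded is also an assumption, since the page only describes the period after playback ends.

```python
from typing import Optional

# Default shown in the Audio Controls panel.
ECHO_GUARD_WINDOW_MS = 3500.0


def in_echo_guard(now_ms: float,
                  playback_ended_ms: Optional[float],
                  is_playing: bool,
                  window_ms: float = ECHO_GUARD_WINDOW_MS) -> bool:
    """Return True while transcripts should still be checked for echo."""
    if is_playing:
        # Assumption: transcripts are also suspect while audio is playing.
        return True
    if playback_ended_ms is None:
        # Nothing has been played yet, so there is nothing to echo.
        return False
    elapsed = now_ms - playback_ended_ms
    return 0 <= elapsed <= window_ms


# Playback ended at t = 10 000 ms; a transcript at 12 000 ms is still guarded,
# one at 14 000 ms is not.
print(in_echo_guard(12_000, 10_000, is_playing=False))
print(in_echo_guard(14_000, 10_000, is_playing=False))
```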

Echo similarity threshold

0.90

Higher = a transcript must match the assistant's reply more closely before it is rejected as echo. Lower = more aggressive filtering.
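To make the threshold concrete, the sketch below scores a transcript against the assistant's last reply and rejects it when the score reaches the threshold. The page does not say which similarity measure it uses. This example substitutes the ratio from Python's difflib.SequenceMatcher on normalized text, and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def echo_similarity(transcript: str, assistant_text: str) -> float:
    """Similarity in [0.0, 1.0] between a transcript and the assistant's reply."""
    a, b = normalize(transcript), normalize(assistant_text)
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def is_likely_echo(transcript: str, assistant_text: str,
                   threshold: float = 0.90) -> bool:
    """Assumed rule: reject the transcript when similarity reaches the threshold."""
    return echo_similarity(transcript, assistant_text) >= threshold


reply = "The weather is sunny today."
partial = "the weather is sunny"
print(round(echo_similarity(partial, reply), 2))
print(is_likely_echo(partial, reply))                  # default threshold 0.90
print(is_likely_echo(partial, reply, threshold=0.80))  # lower threshold
```

Under this measure the partial transcript scores just below 0.90, so it passes at the default threshold and is filtered only when the threshold is lowered.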

Current mic RMS

0.0000

Near-end speech hold

0 ms

Last echo similarity

—

Playback state

idle

Conversation

Assistant audio starts playing as soon as the first PCM chunks arrive. Chat history is saved in localStorage.
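Playing audio as chunks arrive means each chunk has to be decoded on its own, even when a chunk boundary falls in the middle of a sample. The sketch below shows one way to do that, assuming the stream is 16-bit signed little-endian mono PCM. The page does not state the sample format, and Pcm16Decoder is a hypothetical helper.

```python
import struct
from typing import List


class Pcm16Decoder:
    """Incrementally converts PCM16 byte chunks to floats in [-1.0, 1.0)."""

    def __init__(self) -> None:
        # Holds a trailing odd byte until the next chunk completes the sample.
        self._carry = b""

    def feed(self, chunk: bytes) -> List[float]:
        data = self._carry + chunk
        usable = len(data) - (len(data) % 2)  # whole 2-byte samples only
        self._carry = data[usable:]
        ints = struct.unpack(f"<{usable // 2}h", data[:usable])
        return [i / 32768.0 for i in ints]


decoder = Pcm16Decoder()
raw = struct.pack("<3h", 0, 16384, -32768)
# Split the stream at an odd offset to simulate an awkward chunk boundary.
print(decoder.feed(raw[:3]))
print(decoder.feed(raw[3:]))
```

Each call returns samples that can be queued for playback immediately, so playback does not have to wait for the full reply.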

Message

ASR Transcript / Reply Metadata

ASR Transcript

—

Current ASR Partial

—

Last Final ASR

—

Last Reply

—

Tool used

—

Emotion

—

Style

—

Tone

—

Accent

—

Persona

—

Event Log