transcription_session.update
After a successful Realtime API connection, you must send this event before sending audio. It configures session metadata (audio format, language, etc.) for this session.
- You cannot change metadata mid-session. To change it, create a new connection.
- Only same-language transcription is supported (language = target_language). turn_detection=true is not yet supported.
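The ordering rules above can be sketched as a small guard around the connection. This is only an illustration under stated assumptions: `SessionGate`, `send_text`, and `send_audio` are hypothetical names, not part of the API, and the transport (for example a WebSocket client) is supplied by the caller. How audio is framed on the wire is not covered here, so the sketch passes audio through an opaque callable.

```python
import json
from typing import Callable


class SessionGate:
    """Hypothetical helper: allows exactly one transcription_session.update,
    and refuses to forward audio until that event has been sent."""

    def __init__(
        self,
        send_text: Callable[[str], None],
        send_audio: Callable[[bytes], None],
    ) -> None:
        # Both callables are assumptions: whatever your transport exposes
        # for outgoing text messages and outgoing audio.
        self._send_text = send_text
        self._send_audio = send_audio
        self._configured = False

    def configure(self, session: dict) -> None:
        if self._configured:
            # Metadata cannot change mid-session; a new connection is required.
            raise RuntimeError("session already configured; open a new connection")
        event = {"type": "transcription_session.update", "session": session}
        self._send_text(json.dumps(event))
        self._configured = True

    def audio(self, chunk: bytes) -> None:
        if not self._configured:
            raise RuntimeError("send transcription_session.update before audio")
        self._send_audio(chunk)
```

Failing fast on the client side is a design choice of this sketch; it surfaces ordering mistakes before anything is sent over the connection.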
type
Fixed value.
Possible values: [transcription_session.update]
session object required
Realtime API session metadata (audio format, language, etc.).
input_audio_format
Audio format. For all formats, audio must be 1 channel (mono).
For example, pcm16 is 16-bit PCM audio.
For twilio, you can send audio using the Twilio Media Streams API audio format as-is.
Possible values: [pcm16, float32, twilio]
Example: pcm16
input_audio_sample_rate
Input audio sample rate, in Hz.
Example: 24000
input_audio_number_of_channels
Input audio number of channels.
Example: 1
input_audio_transcription object required
Settings for transcribing input audio.
language
Specify the input audio language using an ISO 639-1 code.
Optional, but specifying it may improve transcription accuracy, latency, or both.
Possible values: [en, ja]
target_language
Specify the transcript language using an ISO 639-1 code.
Possible values: [en, ja]
turn_detection
Whether to enable turn detection. If false, the conversation.item.input_audio_transcription.completed event is not sent.
Example: false

{
  "type": "transcription_session.update",
  "session": {
    "input_audio_format": "pcm16",
    "input_audio_sample_rate": 24000,
    "input_audio_number_of_channels": 1,
    "input_audio_transcription": {
      "language": "en",
      "target_language": "en"
    },
    "turn_detection": false
  }
}
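A payload like the one above can be assembled and checked on the client before it is sent. The following is a minimal sketch, not an official client: `build_session_update` is a hypothetical helper, its default arguments simply mirror the sample payload, and omitting the `language` key when no input language is given is an assumption based on that field being described as optional.

```python
from typing import Optional

ALLOWED_FORMATS = {"pcm16", "float32", "twilio"}
ALLOWED_LANGUAGES = {"en", "ja"}


def build_session_update(
    target_language: str,
    language: Optional[str] = None,
    audio_format: str = "pcm16",
    sample_rate: int = 24000,
) -> dict:
    """Return a transcription_session.update event as a dict.

    Raises ValueError for settings the reference lists as unsupported.
    """
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported input_audio_format: {audio_format!r}")
    if target_language not in ALLOWED_LANGUAGES:
        raise ValueError(f"unsupported target_language: {target_language!r}")
    if language is not None and language != target_language:
        # Only same-language transcription is supported.
        raise ValueError("language must equal target_language")

    transcription = {"target_language": target_language}
    if language is not None:
        # Assumption: an unspecified input language is expressed by omitting the key.
        transcription = {"language": language, **transcription}

    return {
        "type": "transcription_session.update",
        "session": {
            "input_audio_format": audio_format,
            "input_audio_sample_rate": sample_rate,
            "input_audio_number_of_channels": 1,  # audio must be mono
            "input_audio_transcription": transcription,
            "turn_detection": False,  # true is not yet supported
        },
    }
```

Calling `build_session_update("en", language="en")` produces a dict equal to the sample payload; serialize it with `json.dumps` before sending.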