
transcription_session.updated

Server acknowledgement that transcription_session.update was received and the session was configured successfully.

After this event, the server can accept other client events such as input_audio_buffer.append.
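As a minimal sketch of the ordering described above, a client could hold back audio until the acknowledgement arrives. The `SessionGate` class and `is_session_ack` function are hypothetical helpers written for illustration, not part of any SDK, and the transport (for example a WebSocket) is deliberately left out.

```python
import json
from typing import List

ACK_TYPE = "transcription_session.updated"


def is_session_ack(raw_message: str) -> bool:
    """Return True if the raw server message is the session acknowledgement."""
    try:
        event = json.loads(raw_message)
    except json.JSONDecodeError:
        return False
    return isinstance(event, dict) and event.get("type") == ACK_TYPE


class SessionGate:
    """Hypothetical helper: buffers audio chunks until the ack is seen."""

    def __init__(self) -> None:
        self.ready = False
        self.pending: List[bytes] = []

    def on_server_message(self, raw_message: str) -> None:
        # Flip to ready once the server confirms the session configuration.
        if is_session_ack(raw_message):
            self.ready = True

    def submit(self, chunk: bytes) -> List[bytes]:
        """Queue a chunk and return the chunks that may be sent now."""
        self.pending.append(chunk)
        if not self.ready:
            return []  # still waiting for transcription_session.updated
        sendable, self.pending = self.pending, []
        return sendable
```

Each chunk returned by `submit` would then be wrapped in an `input_audio_buffer.append` client event by the caller; how that event is built is outside the scope of this sketch.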

event_id string required

Server-generated unique event identifier

type string required

Fixed value

Possible values: [transcription_session.updated]

session object

Realtime API session metadata (audio format, language, etc.).

input_audio_format string

Audio format. For all formats, audio must be 1 channel (mono).

For example, pcm16 is 16-bit PCM audio.

For twilio, you can send audio using the Twilio Media Streams API audio format as-is.

Possible values: [pcm16, float32, twilio]

Default value: pcm16

input_audio_sample_rate integer

Input audio sample rate, in Hz.

Default value: 24000

input_audio_number_of_channels integer

Number of input audio channels.

Default value: 1

input_audio_transcription object required

Settings for transcribing input audio.

language string nullable

Specify the input audio language using an ISO 639-1 code.

Optional, but specifying it may improve transcription accuracy and/or latency.

Possible values: [en, ja]

target_language string required

Specify the transcript language using an ISO 639-1 code.

Possible values: [en, ja]

turn_detection boolean

Whether to enable turn detection. If false, the conversation.item.input_audio_transcription.completed event is not sent.

Default value: false

object string

Fixed value

Possible values: [realtime.transcription_session]
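The field constraints listed above can be checked on the client as a sanity test. The following is a sketch only: `check_session` is a hypothetical helper, and it assumes that omitted optional fields take the documented default values, which the reference does not state explicitly for this event.

```python
from typing import Any, Dict, List

ALLOWED_FORMATS = {"pcm16", "float32", "twilio"}
ALLOWED_LANGUAGES = {"en", "ja"}
# Documented defaults; applying them client-side is an assumption of this sketch.
DEFAULTS = {
    "input_audio_format": "pcm16",
    "input_audio_sample_rate": 24000,
    "input_audio_number_of_channels": 1,
    "turn_detection": False,
}


def check_session(session: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems: List[str] = []
    merged = {**DEFAULTS, **session}

    if merged["input_audio_format"] not in ALLOWED_FORMATS:
        problems.append("unknown input_audio_format")
    if merged["input_audio_number_of_channels"] != 1:
        problems.append("audio must be 1 channel (mono)")

    transcription = session.get("input_audio_transcription")
    if not isinstance(transcription, dict):
        problems.append("input_audio_transcription is required")
    else:
        language = transcription.get("language")  # nullable
        if language is not None and language not in ALLOWED_LANGUAGES:
            problems.append("unknown language")
        if transcription.get("target_language") not in ALLOWED_LANGUAGES:
            problems.append("target_language is required and must be en or ja")

    obj = session.get("object")
    if obj is not None and obj != "realtime.transcription_session":
        problems.append("unexpected object value")
    return problems
```

Returning a list of messages rather than raising on the first failure makes it easier to log every mismatch at once.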

transcription_session.updated
{
  "event_id": "string",
  "type": "transcription_session.updated",
  "session": {
    "input_audio_format": "pcm16",
    "input_audio_sample_rate": 24000,
    "input_audio_number_of_channels": 1,
    "input_audio_transcription": {
      "language": "en",
      "target_language": "en"
    },
    "turn_detection": false,
    "object": "realtime.transcription_session"
  }
}
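One practical use of the acknowledged session is sizing audio chunks. The sketch below parses the example payload above and derives the raw byte rate from the format, sample rate, and channel count. The `bytes_per_second` helper is hypothetical; it assumes `float32` means 4-byte samples and does not attempt to cover the `twilio` format.

```python
import json
from typing import Any, Dict

# Sample widths in bytes. pcm16 is 16-bit per the reference;
# the 4-byte width for float32 is an assumption based on the name.
BYTES_PER_SAMPLE = {"pcm16": 2, "float32": 4}


def bytes_per_second(session: Dict[str, Any]) -> int:
    """Raw audio byte rate implied by the session's audio settings."""
    fmt = session.get("input_audio_format", "pcm16")
    if fmt not in BYTES_PER_SAMPLE:
        raise ValueError(f"byte rate not covered by this sketch: {fmt}")
    rate = session.get("input_audio_sample_rate", 24000)
    channels = session.get("input_audio_number_of_channels", 1)
    return rate * channels * BYTES_PER_SAMPLE[fmt]


EXAMPLE_EVENT = """
{
  "event_id": "string",
  "type": "transcription_session.updated",
  "session": {
    "input_audio_format": "pcm16",
    "input_audio_sample_rate": 24000,
    "input_audio_number_of_channels": 1,
    "input_audio_transcription": {"language": "en", "target_language": "en"},
    "turn_detection": false,
    "object": "realtime.transcription_session"
  }
}
"""

event = json.loads(EXAMPLE_EVENT)
print(bytes_per_second(event["session"]))  # prints 48000
```

With the default settings, 100 ms of audio is therefore 4800 bytes before any transport encoding.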