Overview
The live transcripts dataset gives you access to live text transcripts from events. The dataset is updated in real time as new transcripts become available.
How it works
Our API supports live transcript streaming in JSON Lines (JSONL) format. When you make a live transcript request, the API returns a URL that streams live transcript data as JSONL. We continuously upgrade the format to improve the accuracy and usability of the transcripts. Clients must therefore ignore record types they do not recognize, so that new functionality can be added in the future without breaking them.
Available versions
We currently support two versions of the file format:
- 1.7 – The latest version. This release introduces refinement instructions. To request this version, set the transcriptVersion=1.7 query parameter. For more information, see the refinement instructions section below.
- 1.6 – The default version returned by the API if no version is specified. If your client is already using version 1.6 and ignores unknown record types, it will remain compatible even when new record types are added in version 1.7.
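As a rough illustration of selecting a version, the sketch below appends the transcriptVersion query parameter to a request URL while preserving any parameters already present. The helper name with_transcript_version and the example.com endpoint are hypothetical and only serve the illustration; use the request URL from your own integration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_transcript_version(url: str, version: str = "1.7") -> str:
    """Return url with transcriptVersion set, keeping other query parameters."""
    parts = urlparse(url)
    # Drop any existing transcriptVersion so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "transcriptVersion"
    ]
    query.append(("transcriptVersion", version))
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical endpoint, for illustration only.
print(with_transcript_version("https://api.example.com/transcripts/live?eventId=42"))
```

Omitting the parameter leaves the request on the default version, 1.6.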
File format
- start: This should be the first record in any file. All other keys are metadata for the stream and could be empty.
- entry: This is the default and most common record type and comes after the start record. If no type is specified, the record should be assumed to be of this type. It contains the following:
  - s: The start time in seconds. This is normally sent as a number, but clients should also be able to handle a string.
  - e: The end time in seconds. This is normally sent as a number, but clients should also be able to handle a string.
  - p: The phrase id (note that this is a string and does not have to be decodable as a decimal number). If words are passed line by line, several lines (words) can share the same phrase id so that phrases can be reconstructed in the front-end, if needed.
  - t: The transcript text (in version ≥1.1 this will most often be a single word). If “[indiscernible]” is passed, the transcription is not confident enough to show this word or phrase. In versions prior to 1.4, such records are usually sent with their own phrase id, p (see below). During music or poor sound quality, several “[indiscernible]” records may arrive in a row. Clients do not have to show each one and may collapse them into one, even if they have separate phrase ids.
  - S: The speaker index of the entry. Can be missing, especially for low-confidence phrases. See below for how to handle paragraph-level speakers.
  - ot: The original text that is masked by “[indiscernible]” in t because of low transcription confidence.
  - c: The confidence of the original text, if any.
- keep-alive: This can be sent at any point to indicate that the stream is still going, but nothing is necessarily being said. Clients should ignore these unless they have some logic for stand-off or similar. In practice this will rarely if ever be sent; instead, a file is regarded as active as long as it has not been closed.
- end: Indicates the end of the stream. Nothing can be added to a file after this. If this record exists or appears in the file during the stream, clients should stop polling. It can, but does not have to, contain metadata. If it contains no metadata, a successful exit should be assumed.
  - code: An exit code. 0 means success, anything else indicates failure.
  - system_reason: A reason for ending. Good for debugging. Clients should not display this to the user.
  - user_reason: A reason for ending. Clients could show this.
- interruption: Something went wrong with the live transcription. A restart will be attempted 3 times.
  - time: Time, in seconds, from the beginning of the transcript when this occurred.
  - restarting: Indicates whether there is an attempt at restarting the live transcription. true indicates a restart, false indicates that the live transcription can’t be recovered.
- section: A section delimiter that marks the start or end of some section.
  - name: The name/id of the section, e.g. predicted-qna and predicted-speech.
A section record can arrive well after the speech that it refers to. A section is not guaranteed to exist. For example, predicted-speech could end without ever having been marked as started, and it could start without ever being ended.
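The record handling described above can be sketched in Python as follows. This is a minimal illustration, not a reference client. It assumes the record type is carried in a key named type, and that refinement instructions can be recognized by the presence of an i key; neither detail is confirmed by the description above. The function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, Optional

INDISCERNIBLE = "[indiscernible]"


def to_seconds(value: Any) -> Optional[float]:
    """s and e are usually numbers but may arrive as strings."""
    if value is None or value == "":
        return None
    return float(value)


def read_records(lines: Iterable[str]) -> Dict[str, Any]:
    """Fold JSONL lines into the entries seen so far plus the end status."""
    state: Dict[str, Any] = {"entries": [], "ended": False, "exit_code": None}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "i" in record:
            # Assumption: a refinement instruction (version 1.7); not handled here.
            continue
        kind = record.get("type", "entry")  # assumption: the key is named "type"
        if kind == "entry":
            entries = state["entries"]
            text = record.get("t", "")
            # Collapse consecutive "[indiscernible]" entries into one.
            if text == INDISCERNIBLE and entries and entries[-1]["t"] == INDISCERNIBLE:
                continue
            entries.append({
                "s": to_seconds(record.get("s")),
                "e": to_seconds(record.get("e")),
                "p": record.get("p"),  # phrase id, always treated as a string
                "t": text,
                "S": record.get("S"),  # speaker index, may be missing
            })
        elif kind == "end":
            state["ended"] = True
            # An end record without metadata is treated as a successful exit.
            state["exit_code"] = record.get("code", 0)
            break
        # start, keep-alive, interruption, section and any unrecognized
        # record types are skipped in this sketch.
    return state
```

Skipping everything that is not explicitly handled keeps the client compatible when new record types are introduced.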
Refinement instructions (version 1.7 and above)
Refinement instructions are records with the fields s, e, rt and i. Not all of them are always present; which ones appear depends on the instruction type.
- i: The instruction type. Possible values are:
  - word-update: Updates a word, identified by the word's timestamp. s is the timestamp of the word to update, rt is the new word. Example: {"i": "word-update", "s": 123.45, "rt": "Tomato"}
  - word-insert: Inserts a word at the given timestamp. s is the timestamp of the word to insert, rt is the new word. Example: {"i": "word-insert", "s": 123.45, "rt": "Tomato"}
  - word-delete: Deletes the word at the given timestamp. s is the timestamp of the word to delete. Example: {"i": "word-delete", "s": 123.45}
  - paragraph-insert: Inserts a paragraph break, such that words with s < the timestamp of this instruction and words with s >= the timestamp of this instruction are in separate paragraphs. Example: a paragraph insert placed between “Hello there.” and “My name” puts “Hello there.” in the first paragraph and “My name” in the second.
The instructions often come in chunks, and when applied one after another can perform more complex edits. For example, if a word TomHanks needs to be split into two words, it will be done with one word-delete instruction followed by two word-insert instructions that divide up the time range of the original word.
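The following Python sketch shows one way a client might apply such instructions to words it has already received. It makes several assumptions that the description above does not confirm: words are matched by comparing s within a small tolerance, inserted words are kept in start-time order, and paragraph-insert carries its timestamp in s. The names apply_instruction and paragraphs are hypothetical.

```python
from typing import Any, Dict, List, Optional, Set

Word = Dict[str, Any]  # at least {"s": float, "t": str}


def find_word(words: List[Word], ts: float, tol: float = 1e-6) -> Optional[int]:
    """Index of the word whose start time matches ts, or None."""
    for n, word in enumerate(words):
        if abs(word["s"] - ts) <= tol:
            return n
    return None


def apply_instruction(words: List[Word], breaks: Set[float], instr: Dict[str, Any]) -> None:
    """Apply one refinement instruction in place."""
    if "s" not in instr:
        return
    kind = instr.get("i")
    ts = float(instr["s"])  # timestamps may arrive as strings
    if kind == "word-update":
        n = find_word(words, ts)
        if n is not None:
            words[n]["t"] = instr["rt"]
    elif kind == "word-insert":
        word: Word = {"s": ts, "t": instr["rt"]}
        if "e" in instr:
            word["e"] = float(instr["e"])
        # Insert after every word that starts at or before ts.
        position = sum(1 for w in words if w["s"] <= ts)
        words.insert(position, word)
    elif kind == "word-delete":
        n = find_word(words, ts)
        if n is not None:
            del words[n]
    elif kind == "paragraph-insert":
        breaks.add(ts)
    # Unknown instruction types are ignored.


def paragraphs(words: List[Word], breaks: Set[float]) -> List[str]:
    """Group words into paragraphs, splitting at each break timestamp."""
    result: List[List[str]] = [[]]
    pending = sorted(breaks)
    for word in words:
        while pending and word["s"] >= pending[0]:
            pending.pop(0)
            if result[-1]:
                result.append([])
        result[-1].append(word["t"])
    return [" ".join(p) for p in result if p]


words = [
    {"s": 0.0, "t": "Hello"}, {"s": 0.5, "t": "there."},
    {"s": 1.0, "t": "My"}, {"s": 1.4, "t": "name"},
    {"s": 2.0, "e": 3.0, "t": "TomHanks"},
]
breaks: Set[float] = set()
chunk = [
    {"i": "paragraph-insert", "s": 1.0},
    {"i": "word-delete", "s": 2.0},
    {"i": "word-insert", "s": 2.0, "e": 2.5, "rt": "Tom"},
    {"i": "word-insert", "s": 2.5, "e": 3.0, "rt": "Hanks"},
]
for instruction in chunk:
    apply_instruction(words, breaks, instruction)
print(paragraphs(words, breaks))  # ['Hello there.', 'My name Tom Hanks']
```

Applying the chunk strictly in arrival order matters here: the delete must run before the two inserts, otherwise the first insert would share a timestamp with the word it replaces.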
How to access this data
REST API
Query with filtering by ticker, ISIN, date, and more.
Webhooks
Subscribe to live.transcript.updated events for real-time updates.
Example
Below is an example of how to consume a live transcript. The same example is also available at CodePen. Just make sure to replace the transcriptUrl with your own.
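For clients outside the browser, a comparable consumer could be sketched in Python along the following lines. This is an illustrative sketch rather than the published example. It assumes that the transcript URL can be fetched repeatedly and returns the whole JSONL file so far, and that the record type is carried in a key named type. The function names and the fetch and sleep parameters are hypothetical and exist mainly so the loop can be exercised without a network.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Iterator


def fetch_text(url: str) -> str:
    """Download the transcript file as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def follow_transcript(
    url: str,
    fetch: Callable[[str], str] = fetch_text,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict[str, Any]]:
    """Poll url and yield each new record once; stop after an end record."""
    seen = 0  # number of lines already consumed
    while True:
        lines = fetch(url).splitlines()
        for line in lines[seen:]:
            if line.strip():
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    # Probably a partially written line; retry on the next poll.
                    break
                seen += 1
                yield record
                if record.get("type") == "end":  # assumption: key is "type"
                    return
            else:
                seen += 1  # skip blank lines
        sleep(interval)


# Usage (replace the placeholder with your own transcript URL):
# for record in follow_transcript("https://example.com/your-transcript-url"):
#     print(record.get("t", ""))
```

The loop has no upper bound on retries, so a production client would likely add a timeout and handle interruption records explicitly.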

