Cleaning a Whisper Transcript

Whisper generates transcripts with a variety of flaws. To save time, prioritize fixing some issues over others.

Goals for Cleanup

  • Whisper transcripts should be as accurate to the audio as possible. No additions or omissions.

  • Whisper transcripts should be readable. Grammar should be accurate enough that the transcript can be understood without the audio.

  • Whisper transcripts should be searchable. It is expected that users will search for specific keywords such as the name of a person or the title of a work. Therefore, these names should be spelled correctly.
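The searchability goal can be spot-checked with a short script. The following is a minimal sketch, assuming you already have a list of names and titles you expect to appear in the recording; the function name missing_terms and the sample text are hypothetical. It only flags terms that were not found, so a person still has to listen and decide what the correct spelling is.

```python
def missing_terms(transcript, expected_terms):
    """Return the expected terms that do not appear in the transcript.

    Matching is case-insensitive and looks for the exact spelling only,
    so a term reported here may be absent or may simply be misspelled.
    """
    haystack = transcript.casefold()
    return [term for term in expected_terms if term.casefold() not in haystack]


if __name__ == "__main__":
    # Hypothetical sample: "Le Guin" was transcribed as "Le Gwin".
    sample = "We read Earthsea by Ursula Le Gwin."
    print(missing_terms(sample, ["Earthsea", "Le Guin"]))
```

A term that comes back as missing is a prompt to check the audio, not proof of an error: the speaker may never have said it.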

How to clean a transcript

  • Remove all hallucinations. Hallucinations can be identified by blocks of red text.

  • Fill in omissions. This can be done either by listening to the audio and transcribing by hand or by splicing together multiple AI-generated transcripts. If you do the latter, document where you added text and what file you pulled your addition from.

  • Capitalize proper names and beginnings of sentences.

  • Add punctuation as needed, especially periods and punctuation used within the title of a work.

  • Make sure the following are spelled correctly: names of individuals, places, organizations, events, groups, and works. This includes fictional varieties of each (i.e., characters, fictional organizations, imaginary places).

  • Whatever else may be relevant depending on the collection, what it emphasizes, and how users are expected to interact with it.
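When omissions are filled by splicing in text from another AI-generated transcript, the additions need to be documented. One possible way to keep that record is sketched below. The field names, the file name, and the JSON layout are assumptions for illustration, not a required format; a spreadsheet or a plain note would serve equally well.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SpliceNote:
    """One piece of text added to the cleaned transcript from another file."""
    position: str     # where the text was added, e.g. a timestamp (assumed convention)
    source_file: str  # transcript file the added text was copied from
    added_text: str   # the text that was added, verbatim


def notes_to_json(notes):
    """Serialize a list of SpliceNote records to a JSON string."""
    return json.dumps([asdict(note) for note in notes], indent=2)


if __name__ == "__main__":
    # Hypothetical example entry.
    notes = [SpliceNote("00:12:30", "interview_run2.txt", "and then we left")]
    print(notes_to_json(notes))
```

Keeping the added text verbatim in the record makes it possible to review or undo a splice later without comparing whole files.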

Things not to prioritize due to time constraints

  • Standardizing spelling of words with regional varieties.

  • Adding commas, semicolons, and other punctuation that are not required for the transcript to be understood.

  • Ensuring common connecting words are accurate (e.g., “and” versus “then” when the transcript makes sense with either).

Things to NOT do

  • Removing filler words like “uh” and “um” unless they are used so often that the transcript becomes difficult to read.

  • Correcting the grammar of the speaker or otherwise changing the transcript.

  • Correcting misquoted/misremembered titles, names of individuals, concepts, or quotations from elsewhere (just make a note for the metadata).

  • Censoring words (just make a note of harmful language for the metadata).

  • Removing parts of the transcript deemed “unnecessary” such as commercials or off-script chatter.
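Whether filler words are frequent enough to hurt readability is a judgment call, but a rough count can help inform it. The sketch below is only an aid: the filler list and the function name filler_ratio are assumptions, and these guidelines do not define a cutoff, so the number should be weighed alongside an actual read-through.

```python
import re

# Assumed filler list; extend it to suit the collection.
FILLERS = {"uh", "um"}


def filler_ratio(text):
    """Return the fraction of words in the text that are filler words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(word in FILLERS for word in words) / len(words)


if __name__ == "__main__":
    # Hypothetical sample line from a transcript.
    print(filler_ratio("Um, so, uh, we started recording"))
```

The script only counts fillers. It never removes them, which keeps the default of leaving the speaker's words intact.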