Video captions and audio transcripts

Updated on Sep 12, 2022

This guide outlines your responsibilities as an instructor for providing accessible content in your course. It also includes technical "how to" instructions for adding captions to your video content, covering both the automatic captioning tools available in commonly used video hosting tools and external captioning services.

Am I required to caption/transcribe all video content in my course?
What are the official Accessibility for Ontarians with Disabilities Act (AODA) provincial regulations?
What is the difference between captions and transcripts?
How do I generate captions for my video content?
1. Adding automatic captions to commonly used video hosting tools (Stream, MyMedia, etc.)
2. Generating captions using a captioning service

1. Am I required to caption/transcribe all video content in my course?

Currently, you are not officially required by accessibility policy to caption (or transcribe) your video content as long as it is part of a private environment (e.g. your Quercus course). This policy relies on the idea that a student with an official accommodation need can register it with the U of T Accessibility Office, which will then work with the student to ensure access to the course materials.

However, text support of your video content benefits all learners and it is highly recommended to provide captions for this content, if at all possible.

Captions and transcripts benefit all learners:

those with various auditory and learning abilities
non-native speakers
viewers watching videos on low bandwidth
viewers watching videos in noisy/quiet environments
those learning new terminology, who can use captions and transcripts to improve comprehension, information processing, and retention

Video captions and audio transcripts align with the UDL guideline: providing multiple means of representation.

2. What are the official Accessibility for Ontarians with Disabilities Act (AODA) provincial regulations?

The University of Toronto is committed to the principles of the Accessibility for Ontarians with Disabilities Act (AODA). According to the Ontario Regulation 191/11, section 14:

By January 1, 2014, new internet websites and web content on those sites must conform with WCAG 2.0 Level A.
By January 1, 2021, all internet websites and web content must conform with WCAG 2.0 Level AA, other than,
- success criteria 1.2.4 Captions (Live), and
- success criteria 1.2.5 Audio Descriptions (Pre-recorded).

The information above is from the Web Content Accessibility Guidelines (WCAG) 2.0 website. There is much more to web accessibility than captions and transcripts and the guidelines cover a wide range of recommendations for making Web content more accessible.

3. What is the difference between captions and transcripts?

Captions - Text versions of speech and other important audio content, synchronized to the visual and auditory content. The most common type of captions is “Closed Captions,” which can be turned on or off via the “CC” button on video players.
Transcripts - Text versions of speech and descriptions of important audio and visual information, with no time information attached. Transcripts allow anyone who cannot access the web audio or video content to read a text transcript instead.
Subtitles - Text translations of speech and audio content.

For more information about captions, transcripts, subtitles, and audio descriptions, refer to the World Wide Web Consortium (W3C)’s pages on Captions/Subtitles, Transcripts, and Description.
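
To make the distinction concrete, here is a minimal Python sketch that turns a small caption file into a plain transcript by dropping the timing information. It assumes the captions are in WebVTT format (one common caption file format); the sample text and the function name captions_to_transcript are made up for illustration and are not part of any tool mentioned in this guide.

```python
import re

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.000"
TIMING = re.compile(
    r"^\d{2}:\d{2}(:\d{2})?\.\d{3}\s+-->\s+\d{2}:\d{2}(:\d{2})?\.\d{3}"
)


def captions_to_transcript(vtt_text: str) -> str:
    """Keep only the caption text; drop the header, cue numbers, and timings."""
    spoken = []
    # Cues are separated from each other by blank lines
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        # A block is a cue only if it contains a timing line
        timing_at = next(
            (i for i, line in enumerate(lines) if TIMING.match(line)), None
        )
        if timing_at is None:
            continue  # e.g. the "WEBVTT" header block
        # Everything after the timing line is caption text
        spoken.extend(l.strip() for l in lines[timing_at + 1:] if l.strip())
    return " ".join(spoken)


# Hypothetical sample captions (illustration only)
SAMPLE_VTT = """WEBVTT

1
00:00:01.000 --> 00:00:04.000
Welcome to the course.

2
00:00:04.500 --> 00:00:08.000
[door closes]
Today we cover stress and strain.
"""

transcript = captions_to_transcript(SAMPLE_VTT)
print(transcript)
```

The caption file ties each line of text to a start and end time so a video player can display it in sync, while the resulting transcript is the same text with no time information attached.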

4. How do I generate captions for my video content?

There are two main ways to generate captions for video content - you can either:

use an automatic captioning tool (usually no charge, but has some inaccuracies)
use a captioning service (usually with a charge, but closer to accurate)

4.1. Adding automatic captions to commonly used video hosting tools (Stream, MyMedia, etc.)

If you have video content, you have likely already chosen a service to host it (see select a video hosting/streaming service), based on how you need to share video content with your courses. The process for enabling captions depends on the video hosting tool you selected, and some tools require more work than others.

How to:

Adding captions to MyMedia-hosted video content - MyMedia does not have a built-in captioning tool, so it requires more work than just enabling the auto captions. You have to generate a caption file via another service and then upload this caption file to your MyMedia video.
Enabling captions on Microsoft Stream-hosted video content - Stream does have an auto-captioning tool, but you need to generate the auto-captions for your video before viewers can turn them on or off.
Enabling captions in YouTube-hosted video content - While you should always consider the geographic availability of third-party tools (e.g. YouTube is not accessible in China), this tool might be useful to you for supplemental video hosting and for generating auto-caption files that you can download from YouTube and upload to MyMedia.
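
Because automatic captions contain some inaccuracies, you may want to correct a downloaded caption file before uploading it to your video host. The following is a rough Python sketch of one way to do that, assuming the file is in SRT format (numbered cues with "-->" timing lines). The function name correct_caption_terms, the sample captions, and the list of corrections are all hypothetical, so review the result by hand before uploading.

```python
import re


def correct_caption_terms(srt_text: str, corrections: dict[str, str]) -> str:
    """Apply whole-word, case-insensitive corrections to caption text only.

    Cue numbers and timing lines are left untouched so the captions stay
    synchronized. Caveat: a caption line made up only of digits would be
    treated as a cue number and skipped.
    """
    fixed = []
    for line in srt_text.splitlines():
        is_cue_number = line.strip().isdigit()
        is_timing = "-->" in line
        if not (is_cue_number or is_timing):
            for wrong, right in corrections.items():
                pattern = rf"\b{re.escape(wrong)}\b"
                line = re.sub(pattern, right, line, flags=re.IGNORECASE)
        fixed.append(line)
    return "\n".join(fixed)


# Hypothetical auto-caption output with mis-heard terms (illustration only)
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:04,000
Post your questions on quirks.

2
00:00:04,500 --> 00:00:08,000
The young's modules is a material property."""

# Hypothetical corrections: mis-heard phrase -> intended term
CORRECTIONS = {
    "quirks": "Quercus",
    "young's modules": "Young's modulus",
}

corrected = correct_caption_terms(SAMPLE_SRT, CORRECTIONS)
print(corrected)
```

A simple find-and-replace like this only catches errors you already know about. It does not replace reading through the captions or having a captioning service check them.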

4.2. Generating captions using a captioning service

Unlike auto-captioning performed by AI, a captioning service usually supplements this process with someone checking the captions for accuracy. These services charge a fee (usually per minute) and are external to the University. We do not currently have an internal captioning service. When using a captioning service, you will want to evaluate where your content is stored for captioning, what access you are required to provide to your content, and whether any part of your content is permanently stored by the third-party service.

How to:

NVivo - NVivo software is designed to help researchers organize, code, and analyze qualitative and mixed methods research data. NVivo 12 is under site license at the University of Toronto and is available to faculty, staff, and students at no cost - but transcription services are pay-as-you-go or available as packages (see NVivo transcription services).
Otter.ai - Otter.ai is an application that generates speech to text transcriptions using artificial intelligence and machine learning. Please note: Otter is owned by AISense. Before using the service, read AISense’s Privacy Policy to ensure you are okay with their data collection.
3Play Media

The content on this page was modified for FASE from the TATP's Video Captions and Audio Transcripts website.

