
The (Un)Written Closed Captioning Standards and Protocol

According to one set of statistics, people watch 6 hours and 48 minutes of online video every week. Another source says that the average video length is 4 minutes and 20 seconds.

Now, this is a rough estimate, but let’s indulge it for a second.

This means that an average person watches around 100 videos a week (408 minutes divided by 4 minutes and 20 seconds is roughly 94). That’s a LOT of videos, right?

Now, even if people aren’t video creators or pros, if they are watching that many videos a week, they will start noticing things. Their eyes get trained, in a way.

For example, in most cases, they will be able to tell apart a professionally made video from a smartphone video. Fake or staged videos don’t pass as ‘lucky shots’ that easily anymore. 

People are becoming increasingly experienced ‘video watchers’. Another interesting part about this is that many viewers probably couldn’t even explain how they can tell a pro video apart from an amateur one. 

So, where do subtitles and captions fit into this story? (Subtitles and captions aren’t the same, but that’s another story, and we’ll treat the two terms as synonyms where applicable.) Well, they are a part of the video, aren’t they?

To an untrained eye, all captions are the same. If they are reasonably correct, they are fine.

BUT, in the light of what we just said about an average person becoming a more ‘educated video watcher’, would you be willing to bet that ‘amateur style’ captions would go unnoticed?

Typos are just one thing that will put your viewers off. There are actually many subtleties involved in proper subtitling.

TV networks have subtitlers who deal with these subtleties and make sure they are handled correctly. They have elaborate guidelines on proper subtitling and captioning. If you don’t mind reading them in great detail, here are some of them:

Guidelines and Best Practices for Captioning Educational Video

BBC Subtitle Guidelines

A Proposed Set of Subtitling Standards in Europe

Web Accessibility Initiative

...or you can use this reader’s digest version of all that content and do your best to make pro-looking captions and/or subtitles.

What Makes Good Closed Captioning?

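The 2014 caption quality rules from the Federal Communications Commission focus on accuracy, synchronicity, completeness, and placement. The DCMP Captioning Key breaks quality down along similar lines. According to it, for captions to be high quality, they need to be: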

Accurate - this means no mistakes, typos, or other inaccuracies. The industry standard is 99% accuracy. That’s why you need to review your captions and subtitles, even if you are certain that they are error-free. However, if the speaker uses incorrect grammar, you should not correct it in your captions.

Consistent - for better understanding, all features of your captions need to be consistent throughout the content piece. You should not change fonts or colours unless these changes have a specific meaning. For example, you may choose to use different colours for different speakers.

Clear - captions need to convey who says what, but also to make non-speech information obvious. Their role is to add clarity and help people understand what’s happening on the screen, even if they can’t hear.

Readable - this is more than just picking a good font. It is also about accurately syncing captions with the audio and keeping them on screen long enough to be read.

Equal - this can sound a bit like accuracy. The idea is for the captions to convey the full meaning and intention of the material they represent.
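
To get a feel for what a 99% accuracy target means in practice, here is a minimal Python sketch that compares a caption against a reference transcript word by word. The function names are made up for this example, and the word-matching approach is only one simple way to estimate accuracy. It is not the method of the FCC or any other authority.

```python
import difflib
import re


def words(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_accuracy(reference: str, caption: str) -> float:
    """Share of reference words that the caption reproduces in the same order."""
    ref, cap = words(reference), words(caption)
    if not ref:
        return 1.0
    matcher = difflib.SequenceMatcher(a=ref, b=cap, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


reference = "Captions need to be accurate and easy to read"
caption = "Captions need to be acurate and easy to read"
score = word_accuracy(reference, caption)
print(f"Accuracy: {score:.1%}")
print("Meets 99% target" if score >= 0.99 else "Needs another review pass")
```

A single typo in a nine-word caption already drops the score well below 99%, which is why a review pass matters even for short clips.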

Closed Captioning Style Guide

The best captions and subtitles are those that hit the sweet spot between adhering to all the captioning rules AND being branded. It doesn’t always work out, but there are some ways you can make it happen.

You already know your brand guide, so let’s see what you need to think about when styling your captions:

Font Rules

  • DCMP recommends using white characters in a translucent box. They recommend using medium weight, sans serif fonts with a drop or rim shadow.
  • Use the most readable font from your brand fonts.
  • BBC recommends using system fonts for online use. More precisely, Helvetica for iOS and Roboto for Android.
  • Use mixed case characters. All caps reads as shouting.
  • Spell out numbers up to 10.
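
Two of the rules above (mixed case and spelled-out small numbers) are easy to check automatically. The following Python sketch is one possible way to do it. The helper names are hypothetical, and the regular expression is a simplification that leaves values such as 16:9 or 3.5 untouched and does not capitalise a number at the start of a sentence.

```python
import re

NUMBER_WORDS = {
    "1": "one", "2": "two", "3": "three", "4": "four", "5": "five",
    "6": "six", "7": "seven", "8": "eight", "9": "nine", "10": "ten",
}

# Match a standalone 1-10 that is not part of a larger number, ratio, or decimal.
SMALL_NUMBER = re.compile(r"(?<![\w.:,])(10|[1-9])(?!\w|[.:,]\d)")


def spell_out_small_numbers(text: str) -> str:
    """Replace standalone digits 1-10 with words."""
    return SMALL_NUMBER.sub(lambda m: NUMBER_WORDS[m.group(1)], text)


def is_all_caps(text: str) -> bool:
    """True when the text has several letters and every one is upper case."""
    letters = [c for c in text if c.isalpha()]
    return len(letters) > 1 and all(c.isupper() for c in letters)


print(spell_out_small_numbers("I have 3 cats and 12 dogs"))
print(is_all_caps("STOP RIGHT THERE"))
```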

Caption Position

  • As a general rule of thumb, you should only use two lines per frame.
  • Captions should be cut by two frames to each side.
  • If you need to split the line of your captions, try doing it by following grammar rules.
  • Whenever you can, make sure your lines are equal in length.
  • Usually, your captions should be at the bottom of the screen.
  • Try to avoid covering important details of the video with your captions.
  • BBC recommends using 37 characters per line for Teletext. Channel 4 recommends up to 42.
  • For online use, the BBC says that line length should be 68% of the width of a 16:9 video and 90% of the width of a 4:3 video.
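
The two-line limit, the character limit, and the "equal length" rule can be combined into a small line splitter. This is a minimal Python sketch, assuming a 37-character limit by default. The function name is hypothetical, and the sketch only balances line lengths. It does not know about grammar, so you would still want to check that the break falls in a natural place.

```python
def split_caption(text: str, max_chars: int = 37) -> list[str]:
    """Split text into at most two lines that are as equal in length as possible."""
    text = " ".join(text.split())  # normalise whitespace
    if len(text) <= max_chars:
        return [text]

    words = text.split(" ")
    best = None  # (length difference, [first line, second line])
    for i in range(1, len(words)):
        first = " ".join(words[:i])
        second = " ".join(words[i:])
        if len(first) > max_chars or len(second) > max_chars:
            continue
        diff = abs(len(first) - len(second))
        if best is None or diff < best[0]:
            best = (diff, [first, second])

    if best is None:
        raise ValueError("Text does not fit in two lines; split it into more captions.")
    return best[1]


for line in split_caption("Try to avoid covering important details of the video"):
    print(line)
```

Passing max_chars=42 would apply the Channel 4 limit instead.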


Caption Timing

  • BBC suggests a reading speed of 160-180 words per minute.
  • DCMP recommends that captions should last at least 40 frames, but no longer than 6 seconds.
  • Ideally, match subtitles to the pace of speaking.
  • Change subtitles with the change of scene.
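
The timing limits above translate into a simple check per caption. Below is a minimal Python sketch of such a check. The Cue class and the function name are made up for this example, and the frame rate of 30 frames per second is an assumption, since the guidelines quoted here give the minimum in frames without naming a frame rate.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def timing_issues(
    cue: Cue,
    fps: float = 30.0,        # assumed frame rate
    min_frames: int = 40,     # DCMP minimum duration
    max_seconds: float = 6.0, # DCMP maximum duration
    max_wpm: float = 180.0,   # upper end of the BBC range
) -> list[str]:
    """Return a list of timing problems found in a single cue."""
    duration = cue.end - cue.start
    if duration <= 0:
        return ["end time is not after start time"]

    issues = []
    if duration * fps < min_frames:
        issues.append("too short")
    if duration > max_seconds:
        issues.append("too long")
    words_per_minute = len(cue.text.split()) / duration * 60
    if words_per_minute > max_wpm:
        issues.append("too fast")
    return issues


print(timing_issues(Cue(0.0, 1.0, "Hello there")))
print(timing_issues(Cue(0.0, 3.0, "This caption has a comfortable pace")))
```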

Word Play: Use Caption Format to Add Value to Your Video

There is so much more going on in your captions than just speaking! All those sounds, music, accents…there’s so much important information that you should be conveying with your captions. 

Speaker Identification

There are different ways to do speaker identification. It varies from one authority to another, but also very much depends on which media you are using. Here are some common rules.

According to DCMP, when two people are speaking simultaneously, you should place the captions beneath the speakers. If this is not possible, use different timecodes for different speakers. They don’t recommend using hyphens. You can also use capitalised proper nouns to identify speakers.

BBC recommends several techniques for identifying speakers in their captions:

  • Colour
  • Single quotes to indicate out-of-vision speaker
  • Arrows that indicate sounds that come from out-of-vision space
  • Labels
  • Horizontal positioning
  • Dashes - this is not recommended and should be used only when it can’t be avoided
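
How a speaker label ends up in a caption file depends on the subtitle format. As one illustration, the WebVTT format has a voice tag for this purpose. The Python sketch below builds a single WebVTT cue with a speaker attached. WebVTT is an assumption on our part (the guidelines above don’t prescribe a file format), and the function names and the speaker name are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def voice_cue(start: float, end: float, speaker: str, text: str) -> str:
    """Build one WebVTT cue whose text is wrapped in a voice tag for the speaker."""
    timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
    return f"{timing}\n<v {speaker}>{text}</v>"


print(voice_cue(1.5, 4.0, "Anna", "Where are you going?"))
```

Players that support voice tags can then style each speaker differently, for example with a different colour.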

Speech Specifics

  • According to the BBC, whispers should be labeled or put into brackets.
  • According to the BBC, a sarcastic statement is marked with (!) and a sarcastic question with (?).
  • Occasionally, you may add an all caps word to emphasise. For online use, BBC experiments with using italics instead of caps for emphasis.
  • Accents should be indicated when it’s essential for understanding the broader context. 
  • Sometimes, you can use brackets to explain emotional state - e.g. [angrily] 
  • Use brackets to denote when a speaker is speaking in a foreign language.

Non-Speech Sounds and Music

  • Sounds that aren’t music can be described in brackets if they are necessary for understanding the video content. You can skip this if the sound is obvious. You can use onomatopoeia if it helps to define the sound better.
  • If possible, use italics to indicate that the sound is off-screen.
  • You can use punctuation to describe the rhythm of the sound (e.g. thud…thud…thud).
  • Background music can be described in brackets. If possible, include the performer and the title.
  • If music comes from off-screen, it should be in italics.
  • If you plan on spelling out lyrics, use ♪ before and after them.
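
If you generate captions from a script or a spreadsheet, a few small helpers keep the bracket and ♪ conventions consistent. Here is a minimal Python sketch. The function names are hypothetical, and the exact layout of the music description (title in quotes, then the performer) is only one possible choice, not something the guidelines above prescribe.

```python
def sound(description: str) -> str:
    """Describe a non-speech sound in brackets."""
    return f"[{description.strip()}]"


def music(description: str, title: str = "", performer: str = "") -> str:
    """Describe background music in brackets, with optional title and performer."""
    details = description.strip()
    if title:
        details += f': "{title}"'
    if performer:
        details += f" by {performer}"
    return f"[{details}]"


def lyrics(line: str) -> str:
    """Wrap a line of sung lyrics in music notes."""
    return f"♪ {line.strip()} ♪"


print(sound("door slams"))
print(music("upbeat music", title="Song Title", performer="Performer Name"))
print(lyrics("Happy birthday to you"))
```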

Looks like a lot, right? There are quite a few closed captioning standards and protocols, and not every subtitle format will be able to support all of them. You should do your best to incorporate as many of them as your format allows, especially if you truly want to make your content accessible and reach audiences that are deaf or hard of hearing.
