Captioning Sound Effects, Music and Key Rules

Captions serve as a written representation of the spoken content in a media file, making video content accessible to individuals who are deaf or hard of hearing. They provide a text-based track synchronised with the audio, either as a complement to or substitute for the spoken words.

While primarily composed of dialogue, caption files also incorporate non-verbal elements such as speaker identifications and sound effects, which are essential for comprehending the storyline.

Captioning rules and guidelines

The key captioning guidelines:

1. For better readability, use mixed case. Reserve all capital letters for screaming or shouting.

2. Keep font size, weight, and style consistent throughout the media file.

3. When dividing a sentence into multiple lines of captions, ensure it breaks at a natural pause in speech, where it is logically appropriate.

4. Avoid separating a modifier from the word it modifies. For example:

Incorrect:

John amended his corporate
guidelines

Correct:

John amended
his corporate guidelines

5. Do not break a prepositional phrase. For example:

Incorrect:

The meeting starts at
five o'clock

Correct:

The meeting
starts at five o'clock

6. Do not break a person's name, including any associated title. For example:

Incorrect:

Bob and Linda
Jones are at the meeting

Correct:

Bob and Linda Jones
are at the meeting

7. Do not break a line after a conjunction. For example:

Incorrect:

Bob kicked off the meeting, but
got interrupted

Correct:

Bob kicked off the meeting,
but got interrupted

8. Avoid ending one sentence and beginning another on the same line, unless they are short, related sentences of only one or two words. For example:

Incorrect:

The meeting
has started. Every

attendee has received
an agenda and pencil.

Correct:

The meeting
has started.

Every attendee
has received an agenda and pencil.
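
Some of the line-break rules above lend themselves to a simple automated check. The Python sketch below flags caption lines that end with a conjunction or a preposition. The function name `line_break_problems` and both word lists are hypothetical and deliberately incomplete, so treat this as an illustration rather than a complete checker. It does not try to detect split names or modifiers, which would need real language analysis.

```python
# Hypothetical, incomplete word lists used only for illustration.
CONJUNCTIONS = {"and", "but", "or", "nor", "so", "yet"}
PREPOSITIONS = {"at", "in", "on", "of", "to", "with", "by", "from", "for"}


def line_break_problems(lines):
    """Return warnings for caption lines that end on a conjunction or preposition.

    The final line is skipped because no line break follows it.
    """
    problems = []
    for index, line in enumerate(lines[:-1]):
        words = line.split()
        if not words:
            continue
        # Ignore trailing punctuation and case when comparing the last word.
        last_word = words[-1].strip(".,!?;:").lower()
        if last_word in CONJUNCTIONS:
            problems.append(f"line {index + 1} ends with the conjunction '{last_word}'")
        elif last_word in PREPOSITIONS:
            problems.append(f"line {index + 1} ends with the preposition '{last_word}'")
    return problems
```

Called with `["Bob kicked off the meeting, but", "got interrupted"]`, the function reports a problem on line 1. Called with `["Bob kicked off the meeting,", "but got interrupted"]`, it returns an empty list.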

Captioning sound effects

Sound effects are sounds other than music, narration, or dialogue in a media production. Caption them when they are essential to understanding the content or when they enhance the viewer's experience of it.

Do not include background sound effects in captions when they are not crucial to understanding the plot or enhancing the viewer's experience.

When describing a sound effect, enclose the description in brackets and include the source of the sound. The source may be omitted when it is clearly visible on screen.

Sound effects can be paired with onomatopoeia

Onomatopoeia is a figure of speech in which a word imitates the sound it represents. For example, "buzz," "hiss," and "moo" mimic the sounds associated with bees, snakes, and cows respectively.

A described sound effect can be paired with onomatopoeia. Write both in lowercase, and place the description on the first line of the caption, separate from the onomatopoeia.
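
As a minimal sketch of this layout, the Python function below builds a sound-effect caption with the bracketed description on the first line and the optional onomatopoeia on the second. The name `format_sound_effect` is hypothetical. Lowercasing the whole description is a simplification that would also lowercase proper nouns, so a real tool would need to handle names separately.

```python
def format_sound_effect(description, onomatopoeia=None):
    """Build a sound-effect caption as plain text.

    The description goes in brackets on the first line; the onomatopoeia,
    if given, goes on its own second line. Both are lowercased.
    """
    lines = [f"[{description.strip().lower()}]"]
    if onomatopoeia:
        lines.append(onomatopoeia.strip().lower())
    return "\n".join(lines)


print(format_sound_effect("Phone ringing", "Bring, bring"))
```

The `print` call writes `[phone ringing]` on one line and `bring, bring` on the next.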

Off-screen sound effects

When sound effects occur off-screen, display them in italics, if available.

If an off-screen source produces the same sound again in subsequent captions, there is no need to repeat the source.

Speed and pace of sound effects

Repeating the sound effect and using descriptive words can imply a rapid pace. For example:

[footsteps rapidly approaching]

whoosh whoosh

Using punctuation can also indicate speed or pace of sound. For example:



[doorbell ringing]

[doorbell ringing]
ding, ding, ding

When describing an abrupt sound, use the third-person verb form. For example:

[dog barks]


Repeated words

If a sound is represented by a repeated word, it is not hyphenated. However, if a sound is represented by two different words, it is hyphenated.

For example:

[doorbell ringing]
ding, ding

[doorbell ringing]
ding-dong
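
The hyphenation rule can be sketched as a small Python helper. The name `join_sound_words` is hypothetical, and the helper makes two assumptions: repeated words are separated by a comma and a space, as in the "ding, ding" example, and any mix of different words is joined with hyphens.

```python
def join_sound_words(words):
    """Join the words of an onomatopoeia according to the repeated-word rule.

    Identical words are separated by ", " (not hyphenated);
    different words are joined with hyphens.
    """
    lowered = [word.lower() for word in words]
    if len(set(lowered)) == 1:
        return ", ".join(lowered)
    return "-".join(lowered)
```

With this helper, `join_sound_words(["ding", "ding"])` gives `ding, ding`, while `join_sound_words(["ding", "dong"])` gives `ding-dong`.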

Sound descriptions

Use precise terms rather than broad, ambiguous ones to describe sounds. For example, write [sparrow singing] rather than [bird singing].

Do not use the past tense when describing sounds. Captions should be synchronised with the sound and should therefore stay in the present tense.

Use vocabulary that is familiar to the intended audience's age group.

For example:

[dog barking]

[phone ringing]
bring, bring

[baby sneezing]

[clock ticking]

[bird calling]
tweet, tweet

[car racing]

Captioning background music

Describe instrumental or background music only when it is essential to understanding the content. When you do, follow these rules:

  • Include the performer or composer and the title in the caption, if possible.
  • Italicise descriptions of off-screen background music.
  • Indicate the mood, and be as objective as possible.
  • Avoid subjective words, such as "delightful," "beautiful," or "melodic".
  • Use a music icon (♪) to indicate nonessential background music.
  • Do not caption background music that lasts less than 5 seconds.
  • If the music contains lyrics, caption the lyrics verbatim. Introduce the lyrics with the name of the artist and the title in brackets, if the presentation rate permits. For example: [Elton John singing “Rocket Man”]
  • Caption lyrics with music icons (♪), separated from the text by a space: one music icon at the beginning and two icons at the end of the last line of a song. For example: ♪ The Leslie mic is still on, apparently ♪♪
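
The music-icon and duration rules above can be sketched in Python as follows. The names `caption_lyrics` and `should_caption_music` are hypothetical. Closing each non-final lyric line with a single icon is an assumption made for this sketch; the rules above only specify the two icons that end the last line of a song.

```python
MUSIC_ICON = "♪"
MIN_MUSIC_SECONDS = 5  # threshold from the rule on short background music


def should_caption_music(duration_seconds):
    """Return False for background music lasting less than five seconds."""
    return duration_seconds >= MIN_MUSIC_SECONDS


def caption_lyrics(lyric_lines):
    """Wrap lyric lines in music icons, each separated from the text by a space.

    The last line of the song ends with two icons. Ending the other lines
    with one icon is an assumption of this sketch, not a stated rule.
    """
    captions = []
    last_index = len(lyric_lines) - 1
    for index, line in enumerate(lyric_lines):
        closing = MUSIC_ICON * 2 if index == last_index else MUSIC_ICON
        captions.append(f"{MUSIC_ICON} {line.strip()} {closing}")
    return captions
```

For a one-line song, `caption_lyrics(["The Leslie mic is still on, apparently"])` returns a single caption that matches the example above.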

In many parts of the world, particularly in European and Latin American countries, the terms "captions" and "subtitles" are used interchangeably. In the United States, however, there is a clear distinction between the two. Captions are intended for viewers who are unable to hear and are often indicated by a "CC" icon on video platforms or remote controls. Subtitles, by contrast, are for viewers who can hear but may not understand the language spoken in the video. Unlike captions, subtitles typically omit non-verbal audio elements. See more about the difference between captions and subtitles.

Subly can help you create accessible content with highly accurate closed or open captions, transcripts, audio descriptions, and improved colour contrasts that boost the accessibility of your video files.
