Broadcast and the Hearing Impaired


"I hope I inspire people who hear. Hearing people have the ability to remove barriers that prevent deaf people from achieving their dreams."

Marlee Matlin


Did you know that more than 37 million Americans aged 18 or older have some kind of hearing loss? And 30 million Americans aged 12 or older have hearing loss in both ears? In our media-rich society, that makes listening to narration, dialog, and speech in general difficult. Before 1972, anyone hard of hearing had to watch television with the volume turned up. That's when ABC and PBS teamed up and began offering alternative broadcasts of news programs with visible captions. Although "closed captioning" had been successfully demonstrated that same year, the hearing impaired would have to wait until 1980 before decoding devices and specially encoded programs became available.

Of course today, most television programs are CC'd, including live events like sports and award shows. Even online media sites such as YouTube offer some form of closed captioning. For the completely deaf, this is one of only a few options for enjoying television. If someone only has trouble hearing speech, then about the only other option right now is still the same as in 1972 – TURN IT UP. But there are better times on the horizon.

Researchers have confirmed that speech with music or sound effects in the background can be hard to discern, which is also one of the most common complaints to broadcasters.* Ben Shirley and Rob Oldfield, researchers at the University of Salford in the UK, have pointed out that not all the fault lies with the mixing engineers or broadcast station. It may be the hardware at home.

Most broadcast programming is now delivered in 5.1 surround sound. This six-channel format includes a separate channel just for dialog. But if the viewer is listening on a stereo TV, there may be issues with how it transforms those six channels into two (downmixing). Shirley and Oldfield noted that some broadcast outlets transmit a stereo mix alongside the 5.1 mix; when one is available, stereo TV sets automatically select it, usually with no problem. But a masking phenomenon sometimes occurs when the channel containing speech gets downmixed, creating a garbled, dull, and softer sound.
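To see why downmixing can hurt dialog, consider the standard fold-down most receivers apply. This is a minimal sketch (not any particular TV's firmware) using the common ITU-R BS.775-style coefficients, where the center channel is attenuated by 3 dB (a factor of about 0.707) and split into both stereo speakers. The function name and sample values are illustrative assumptions.

```python
def downmix_51_to_stereo(L, R, C, LFE, Ls, Rs):
    """Fold one 5.1 sample frame down to stereo using the common
    ITU-R BS.775-style coefficients: center and surrounds are
    attenuated by ~3 dB (x 0.707); the LFE channel is typically
    discarded in a stereo fold-down."""
    lo = L + 0.707 * C + 0.707 * Ls
    ro = R + 0.707 * C + 0.707 * Rs
    return lo, ro

# The dedicated dialog (center) channel arrives about 3 dB quieter
# and is shared between both speakers, where the music and effects
# already in L/R can mask it.
lo, ro = downmix_51_to_stereo(L=0.5, R=0.5, C=1.0, LFE=0.2, Ls=0.3, Rs=0.3)
print(lo, ro)
```

The key point is visible in the coefficients: dialog that was isolated and full-level in its own channel ends up quieter and competing with everything else in just two channels.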

As an engineer who mixes for a broad range of formats, I can attest that it's impossible to create a mix that will sound great in every listening environment and on every device. The researchers in Salford note that because most programs are mixed in an optimal studio with professional monitors, the engineer easily hears the speech over background music and effects.

So is there a solution? Experiments in "object-based audio" look very promising not only for the hearing impaired, but for an enhanced overall audio experience for any listener. Currently, audio is directed to one or more particular loudspeakers during the mix so as to create an environment. When you try to transform that into a different one (6 channels down to 2, for example), the balance is disrupted. The object-based solution is to tag certain audio events with location (and other) metadata that can be selectively chosen by the listener – sort of like web hypertext for audio. For instance, all speech could be tagged, as could certain sound effects, solo instruments in musical groups, and so on. The mixing engineer would lose some control of balance, but the listener experience would be more interactive and enriched. Sporting events could have the basketball rim, batter's box, or line of scrimmage tagged. The lead singer, background singers, and lead guitar in a rock group could be tagged.
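The idea above can be sketched in a few lines: instead of receiving a fixed channel mix, the listener's device receives tagged objects and applies a per-tag gain of the listener's choosing. Everything here (the `AudioObject` class, the `render` function, the tag names) is a hypothetical illustration of the concept, not any broadcaster's actual format.

```python
from dataclasses import dataclass

@dataclass
class AudioObject:
    """One tagged sound in an object-based scene."""
    name: str                      # e.g. "commentary", "crowd"
    category: str                  # metadata tag the listener selects on
    samples: list                  # audio payload (placeholder values)
    position: tuple = (0.0, 0.0)   # spatial metadata, unused in this sketch

def render(objects, gains, default_gain=1.0):
    """Mix objects to a single mono bus, scaling each object by the
    listener-chosen gain for its category tag."""
    length = max(len(o.samples) for o in objects)
    out = [0.0] * length
    for o in objects:
        g = gains.get(o.category, default_gain)
        for i, s in enumerate(o.samples):
            out[i] += g * s
    return out

# A hard-of-hearing listener boosts dialog and ducks effects:
scene = [
    AudioObject("commentary", "dialog", [0.2, 0.2, 0.2]),
    AudioObject("crowd", "effects", [0.5, 0.5, 0.5]),
]
mix = render(scene, gains={"dialog": 2.0, "effects": 0.25})
print(mix)
```

The balance decision moves from the mix stage to the playback stage, which is exactly the trade-off described above: the engineer gives up some control, and the listener gains it.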

Of course, this would change the way we audio engineers would collect, organize, and mix sounds. A live event would need certain microphones tagged, whereas the post mix engineer would embed particular audio events with metadata. The whole process, including software and hardware, would need to change its core. Can it be done?

I remember that when I first started out in this industry, digital was seeping into the studios. It wasn't really understood how it would be used, or to what depth. What seemed fantasy at the time was not even close to the way it really is today. The only concept of non-linear audio we had at the time was the ability to pick up a turntable needle and randomly place it anywhere on the record. The same might be said for the concept of object-based audio, but it's strikingly similar to a web browser, smartphone, or game console. Our familiarity with those technologies makes me think it will be an incremental adjustment for audio engineers, not a "nuclear meltdown/Planet of the Apes rebuild." Although talking apes would be really cool – closed captioned of course.

* Shirley, B., and Oldfield, R., "Clean Audio for TV Broadcast: An Object-Based Approach for Hearing-Impaired Viewers," Journal of the Audio Engineering Society (JAES-D-14-00081), University of Salford, Salford, UK.

Did You Know?


  • About 2 to 3 out of every 1,000 children in the United States are born with a detectable level of hearing loss in one or both ears.
  • More than 90 percent of deaf children are born to hearing parents.
  • Men are more likely than women to report having hearing loss.
  • Roughly 10 percent of the U.S. adult population, or about 25 million Americans, has experienced tinnitus lasting at least five minutes in the past year.
  • Hearing loss is a major public health issue that is the third most common physical condition after arthritis and heart disease.
  • On February 15, 1972, ABC and the National Bureau of Standards presented closed captions embedded within the normal broadcast of “Mod Squad.”
  • In 1972, “The French Chef” made history as the first television program accessible to deaf and hard-of-hearing viewers.
  • In 1980, there were only three captioned home video titles.
  • In 1990, a law—the Television Decoder Circuitry Act of 1990—was passed mandating that all televisions 13 inches or larger manufactured for sale in the U.S. contain caption decoders.
  • Real-time captioning requires stenographers who have been trained to type at speeds of up to 250 words per minute, giving viewers instantaneous access to live news, sports, and information.
  • A survey carried out by the BBC in 2010 indicates that 60% of viewers had difficulty in understanding speech on TV.
  • Some research suggests that adding a central loudspeaker, as used in 5.1 surround sound systems, may benefit television sound compared to a central "phantom" stereo image.

Neil Kesterson
