Games CC

Interview with Valve's Yahn Bernier - June 28th, 2005

questions by Reid Kimball


Here at Games[CC] we take great interest in researching and developing solutions for adding closed captions to games. While playing Half-Life 2 by Valve Software, it was evident that they had done their research and taken the issue of closed captioning seriously. Games[CC] and the IGDA Game Accessibility SIG were interested in learning more about Valve Software's experiences creating the custom closed captioning system for Half-Life 2. It wasn't long before Yahn Bernier replied to our questions, and his responses are well worth the read.


1. How long did it take to design and program the captioning system, i.e. the system that recognizes that a sound has played and displays its relevant text on screen given various criteria (if any, such as distance from the player)?

It probably took two weeks of my time to implement and refine the code side of this. Because we changed all of our sounds to play via an EmitSound() call, and that call takes a shorthand 'name' for the sound, it was easy to make that name be the captioning lookup key as well. We also knew that we wanted certain sounds to display in different colors (effects versus NPCs speaking, etc.), so we added the ability to embed simple HTML-like tags into the actual localized caption text (including coloration, bold/italics, line breaks, non-default "linger" time, etc.). All of the code for the system is built into the game and client .dlls and is in the public part of the SDK code. I'd look there to see how we implemented this stuff: specifically hud_closecaption.cpp/.h in the cl_dll, and the EmitSound code and EmitCloseCaption code in the game .dll (sceneentity.cpp makes use of this for acted scenes). Also, we drew a distinction between subtitling and closed captioning (subtitling being just the dialogue, as if you weren't hearing impaired but were listening to dialogue in a foreign language, for instance).
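The scheme Bernier describes, captions keyed by the same shorthand name passed to EmitSound(), with simple markup tags embedded in the localized text, can be sketched roughly as follows. This is an illustrative reconstruction, not Valve's code: the sound names, caption strings, and exact tag syntax here are assumptions.

```python
# Hypothetical caption table keyed by EmitSound()-style shorthand names.
# Tag names (<clr:...>, <len:...>, <I>) mimic the kind of HTML-like
# markup described above but are assumptions, not the SDK's exact format.
import re

CAPTIONS = {
    "npc_alyx.greeting": "<clr:255,212,255>Alyx: Hey, you made it!",
    "world.explosion":   "<clr:255,176,0><I>[Explosion]</I><len:1.5>",
}

TAG_RE = re.compile(r"<(\w+)(?::([^>]*))?>")

def parse_caption(sound_name):
    """Return (plain_text, attributes) for a sound's caption, or None."""
    raw = CAPTIONS.get(sound_name)
    if raw is None:
        return None
    attrs = {}
    for name, arg in TAG_RE.findall(raw):
        if name == "clr":                 # coloration tag
            attrs["color"] = tuple(int(c) for c in arg.split(","))
        elif name == "len":               # non-default "linger" time
            attrs["linger"] = float(arg)
        elif name in ("I", "B"):          # italics / bold
            attrs[name.lower()] = True
    # Strip opening tags, then the closing </I>/</B> forms.
    text = TAG_RE.sub("", raw).replace("</I>", "").replace("</B>", "")
    return text, attrs
```

The appeal of this design is that the sound name does double duty: playing a sound and looking up its caption use the same key, so no second bookkeeping layer is needed.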

2. What was the most difficult aspect of the system to implement?

Tuning the system was much harder than actually programming it. Marc Laidlaw, our writer, and Bill Van Buren, who worked closely with our voice actors, had to go through and tag a lot of the dialogue in the script. Luckily, everything was in one centralized Unicode text file, so they could work in there as needed. Marc and Bill had to spend a lot of time watching the captions in the game and tuning them, especially captions for weapon and environmental sounds. We allowed each caption to specify how much time had to pass before the same caption could be seen again. Something like a machine gun, therefore, would show up as a single-line caption every few seconds instead of a steady scroll of captions.
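The per-caption repeat suppression described above can be sketched as a simple timestamp check. This is a minimal illustration under assumed names and intervals, not the actual SDK logic.

```python
# Sketch of per-caption repeat suppression: each caption carries a
# minimum interval before it may be shown again, so a rapidly repeating
# sound (e.g. machine-gun fire) yields one caption line every few
# seconds rather than a steady scroll. Key names are hypothetical.
class CaptionThrottle:
    def __init__(self):
        self._last_shown = {}   # caption key -> time it last displayed

    def should_show(self, key, now, min_interval):
        """True if the caption may display at time `now` (seconds)."""
        last = self._last_shown.get(key)
        if last is not None and now - last < min_interval:
            return False        # suppressed: shown too recently
        self._last_shown[key] = now
        return True
```

For example, a gun firing every 0.1 seconds with a 3-second interval would caption only at roughly t = 0, 3, and 6 seconds over a 6-second burst.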

3. I believe Valve has a custom tool that allows captions to be created for a sound file. How long did that take to develop? Was the development worth it, did it save time in the creation of the closed captions?

We used Faceposer to extract phoneme data for sound files (the extractor part of it is actually a separate .dll, so we have some command line tools to do batch processing, which we used for phonemes in the localized versions of the game). One of the steps was to type in the text of a .wav as an initial hint to the extraction system. That system drives the facial animation. The text and phonemes are stored directly in the .wav file (the .wav format is RIFF, which allows custom chunks to be embedded). We were able to use one of our tools to extract the phoneme-related text from these .wav files and use that as a first pass at the English captioning data. It was an unintentional benefit of the facial animation system that we had most of the English captions roughed out automatically.
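The trick of storing text inside the .wav works because RIFF files are just a sequence of (4-byte id, 4-byte little-endian size, payload) chunks, so a tool can append a custom chunk and scan for it later. A minimal sketch, assuming a made-up chunk id ("VDAT" is not Valve's actual id):

```python
# Minimal RIFF custom-chunk round trip. RIFF layout: 'RIFF', total
# size, form type ('WAVE'), then a list of (id, size, data) chunks.
import struct

def add_chunk(riff_bytes, chunk_id, payload):
    """Append a custom chunk and fix up the overall RIFF size field."""
    chunk = chunk_id + struct.pack("<I", len(payload)) + payload
    if len(payload) % 2:
        chunk += b"\x00"                      # pad data to word boundary
    out = riff_bytes + chunk
    return out[:4] + struct.pack("<I", len(out) - 8) + out[8:]

def find_chunk(riff_bytes, chunk_id):
    """Scan the chunk list for `chunk_id`; return its payload or None."""
    pos = 12                                  # skip 'RIFF', size, 'WAVE'
    while pos + 8 <= len(riff_bytes):
        cid = riff_bytes[pos:pos + 4]
        size = struct.unpack_from("<I", riff_bytes, pos + 4)[0]
        if cid == chunk_id:
            return riff_bytes[pos + 8:pos + 8 + size]
        pos += 8 + size + (size % 2)          # chunks are word-aligned
    return None
```

Because unknown chunks are simply skipped by any standard .wav reader, the embedded text travels with the audio file without affecting playback, which is what made the phoneme text reusable as a first pass at the captions.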

4. Is there anything you would design/implement differently if you were to design another captioning system?

There were a few things I read about on-line. The main thing we didn't put in the UI was a history view, or a way to dump out the captions to a text file so you could read through the transcript like a screenplay.

5. Was there any discussion about the use of colors in the captioning text and how other cultures may perceive those color assignments?

Yes, actually. We initially only colored world effects differently from speech. All speech was white. When we had hearing impaired testers come in, their feedback was that it was difficult to figure out who was speaking since all of the text was white. At that point, we went back into the captions and added coloration tags to each main NPC in the game to differentiate them from each other during acted scenes. I don't believe that we looked at cultural perception issues with the colors when they were chosen. That would be a good question for Greg Coomer and Marc Laidlaw.

Thanks go to Yahn and Valve Software for sharing their experience creating one of the most comprehensive closed captioning systems for games.