
Subtitling Groucho: A Case Study


When I was deciding on a subtitling project to hone my skills, my mind immediately went to a particular 1959 episode of What’s My Line?, a live panel game show that ran from 1950 to 1967. Sixty years on, this episode has amassed over one million views on YouTube, thanks to the antics of one Groucho Marx.

The Show

What’s My Line? is generally a genteel, sophisticated show, especially for the game show genre. The regular panelists are Bennett Cerf, founder of Random House and publisher of Joyce, Faulkner, and many other 20th-century literary giants; Dorothy Kilgallen, a journalist and radio personality known for her columns and her coverage of the Sam Sheppard case; and Arlene Francis, a Broadway actress and fixture of the first couple of decades of television. It is moderated by John Charles Daly, a news anchor for CBS. The game usually proceeds with the regular panelists, joined by a fourth panel member, asking a guest yes-or-no questions in order to figure out the guest’s occupation, which the audience has seen and the panel has not. Proceedings are normally quite formal and dignified, with perhaps a few laughs when someone asks a question that is particularly amusing in light of the contestant’s occupation. There is also always a celebrity mystery guest, who disguises their identity, which the panel then has to guess.

The Episode

But in this episode, normal gameplay goes out the window, because Groucho Marx comes on, in his first of two appearances on the show. He brings with him his comedic sensibilities, honed in vaudeville, and a complete disregard for the rules of the game. Everything and everyone is thrown off-kilter, and this creates every potential subtitling issue you could think of. There are multiple people speaking at once, interruptions, forced narratives, dialogue that is either too short or too long to fit in a subtitle according to Netflix’s guidelines, and even the use of a foreign language. This last issue, however, was not Groucho’s doing. The live broadcast happened to take place the week Khrushchev visited the United States, and there are numerous topical references throughout the episode, including celebrity mystery guest Claudette Colbert deciding to disguise her identity by answering questions with “da” or “nyet.”

These issues also have to be dealt with in different ways, depending on the language. Certain things require subtitles in some languages, but not others. For this project, I have chosen Russian, but I have considered the ways that these issues may be handled differently in other languages.

Subtitling Issues

Foreign Dialogue

According to Netflix’s guidelines, whether foreign dialogue is translated depends on the context. Is the viewer meant to understand it? Was it translated in the original? In this case, although Dorothy Kilgallen visibly struggles with the answers in the episode, the viewer is supposed to understand them, and therefore they should be subtitled. It is not necessary, however, to translate “da” and “nyet” for a Russian audience. For any other language, we would have to follow Netflix’s guidelines for text in a foreign language. It would be best to simply translate the answers as “yes” and “no” in the target language, although for English closed captions, “da” and “nyet” would be best. If we translate “da” and “nyet,” we would also indicate in the subtitle that Colbert is speaking Russian, as per the guidelines.

Onscreen Text

Onscreen text is a consistent presence in this show. The contestants write their names on a chalkboard, their occupations (or name, in the case of the mystery guest) are flashed on the screen, and there is a placard in front of each panelist bearing their name. Onscreen text should be subtitled only when essential. Such a subtitle is known as a “forced narrative” subtitle. Subtitling the name placards would be distracting, and would not really add to the viewer’s understanding, since the panelists are frequently referred to by name. The professions are obviously necessary for the viewer’s understanding of the game, and should be translated in each case. In addition, those involved in the game generally stay quiet while this text is on screen, because the audience at home is meant to be reading it (although there may be laughter from the studio audience if the occupation is unexpected or amusing).

Names, however, are another story. Daly always says the contestant’s name, so there is no need to subtitle the chalkboard. As mentioned above, the mystery guest’s name is also shown in additional onscreen text, just like the occupations of the other contestants. A subtitle for this text would not be required in a language that uses the Latin alphabet, since the name would be written the same way. But for Russian or another language that does not use the Latin alphabet, it should be subtitled to ensure that the viewer understands who this person is, as in the clip below. Timing is important here, because in this case, at least, conversation begins before the name has left the screen. I also had to move the dialogue subtitle that follows the forced narrative to the top position, because the onscreen text was still displayed and would otherwise have interfered with the subtitle.

Multiple Speakers, Interruptions, Inaudible Dialogue

The game was filmed live, and conversations occur naturally. There is no script. In addition, in this episode, we have Groucho interrupting and making side comments to the other panelists throughout the show. The latter also sometimes requires judgment, because the side conversations are not always easy for the viewer to hear. The standard I used while transcribing was that if I could understand it, it should be subtitled. The various interruptions and people talking over one another require some finesse when deciding where certain dialogue should go, especially since Netflix’s guidelines require no more than two lines per subtitle.

Dialogue That Is Too Short or Too Long

The “yes” and “no” answers of the game tend to be short, especially during Colbert’s segment, where she answers in Russian and offers no additional commentary. Generally, this can be solved by combining the answer with the subtitle for the next line, especially when the next person speaks immediately after her. You can see some examples of combining short dialogue with long in the video above. With someone like Daly, length can be extremely challenging, since he tends to be verbose, sometimes to the point of exaggeration.

An example of a Daly response:

I think if you’re going to pursue this line of questioning, we must first elicit, or at least find the term of reference in which the question is asking — if you are speaking of physical contact, or the normal social contact of the conduct of business, we would be able to help you.

John Charles Daly to Dorothy Kilgallen

This, on its own, is a challenge to subtitle, especially in combination with other people participating in the conversation. When you have to make cuts in order to fit guidelines, part of the viewing experience is lost. When I spotted the dialogue, it ended up being:

Subtitle 1: If you’re going to pursue this line, we must first elicit,
Subtitle 2: or find the term of reference in which the question is asking,
Subtitle 3: if you are speaking of physical contact, or the normal contact of business,
Subtitle 4: we would be able to help you.
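
The kind of splitting shown above can be roughly approximated in code. What follows is only a minimal sketch, not the process used for this project: it greedily wraps the condensed Daly line and groups the result into subtitles of at most two lines. The 42-character line limit is an assumption based on Netflix’s general guidelines for English (other languages may use a different limit), and the function name is hypothetical.

```python
import textwrap

# Assumed limits: two lines per subtitle (as stated in the text) and
# 42 characters per line (an assumption; check the guide for your language).
MAX_CHARS_PER_LINE = 42
MAX_LINES_PER_SUBTITLE = 2


def split_into_subtitles(text, max_chars=MAX_CHARS_PER_LINE,
                         max_lines=MAX_LINES_PER_SUBTITLE):
    """Greedily wrap text into lines, then group the lines into subtitles."""
    lines = textwrap.wrap(text, width=max_chars)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


# The condensed version of Daly's response from the subtitles above.
daly = ("If you're going to pursue this line, we must first elicit, "
        "or find the term of reference in which the question is asking, "
        "if you are speaking of physical contact, or the normal contact "
        "of business, we would be able to help you.")

for number, subtitle in enumerate(split_into_subtitles(daly), start=1):
    print(f"Subtitle {number}:")
    for line in subtitle:
        print(f"  {line}")
```

A greedy wrap like this only enforces length. It knows nothing about clause boundaries, reading speed, or who is speaking, which is why the breaks chosen by hand above fall at commas rather than wherever a line happens to fill up.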

This episode poses many challenges for the subtitler, even before getting into the challenges of actually translating the dialogue, which I did not take on, since I used machine translation. Discussing how a subtitle translator would handle some of the jokes, along with an unintentional double entendre in the second segment of the show featuring a poor, very rattled Dorothy Kilgallen, would require an entire second essay. As it was, this project could truly serve as a crash course in addressing a range of issues that come up during the subtitling process.

Special thanks to W. Gary Wetstein of the What’s My Line? YouTube Channel for providing me with his original file of the episode.

Machine Translation

Training a Neural Machine Translation Engine

Our team was tasked with training a neural machine translation engine using Microsoft Custom Translator. Our language pair was English to Russian. Based on a case study we read, we decided that a suitable candidate for this kind of pilot project was the set of press releases featured on the website of the U.S. Embassy in Moscow, because they are informative texts intended for a general audience. We outlined these goals in our pilot project proposal.

Our hypothesis was correct: we were able to train the engine to translate these texts, with the end result requiring minimal post-editing. We still encountered issues along the way, which are explained in our updated proposal and presentation on lessons learned. While our results were excellent, we ended up having to add parallel corpus data from the United Nations in order to have enough data. We could have had even better results with data pulled entirely from the State Department. In addition, we found that the BLEU score was not an entirely accurate representation of translation quality. A major issue in our test translation was that proper nouns were being translated when they should not have been. For example, “Huntsman,” as in “Jon Huntsman,” was being rendered as the Russian word for “hunter.” On the advice of an industry expert, we added a glossary, and while the BLEU score for this round was slightly lower, having the names of important people and organizations rendered correctly greatly reduced post-editing time.
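
A problem like the proper-noun issue described above can be flagged with a simple check that a BLEU score alone would not surface. The following is a minimal sketch under stated assumptions: it is not part of Microsoft Custom Translator, the function name is hypothetical, and the glossary entry and sample sentences are illustrative rather than data from our project.

```python
def find_glossary_violations(source, translation, glossary):
    """Return (term, required rendering) pairs for glossary terms that
    appear in the source but whose required rendering is missing from
    the translation. Matching is a case-insensitive substring check."""
    violations = []
    source_lower = source.lower()
    translation_lower = translation.lower()
    for term, required in glossary.items():
        if term.lower() in source_lower and required.lower() not in translation_lower:
            violations.append((term, required))
    return violations


# Hypothetical glossary entry: the name should be transliterated, not translated.
glossary = {"Huntsman": "Хантсман"}

# Illustrative sentences, not taken from the project data.
source = "Ambassador Jon Huntsman met with students."
bad_output = "Посол Джон Охотник встретился со студентами."
good_output = "Посол Джон Хантсман встретился со студентами."

print(find_glossary_violations(source, bad_output, glossary))
print(find_glossary_violations(source, good_output, glossary))
```

Because Russian nouns decline, a plain substring check is only a rough filter. It happens to match forms such as “Хантсмана” that extend the base form, but names whose stems change would need proper morphological handling.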

Overall, this project provided valuable experience in training a neural machine translation engine. Our results show the importance of domain when deciding whether a project is suitable for this kind of machine translation, and also the limitations of automated evaluation methods like the BLEU score in comparison to human evaluation.