That title means both “words and pictures” and “words in pictures,” because both phrases describe comic books. Although not all comics include words, essentially all superhero comics do. (A near exception, the five-page “Young Miracleman” story in the back of Miracleman #6, includes two talk balloons each containing the transformation-triggering word “Miracleman!” and a range of sound effects, newspaper headlines, and signage.) How images and text work together is one of the most complex and distinctive qualities of the form.
Words in comics have their traditional linguistic meanings, but they are also drawn images that must be understood differently than words in prose-only works. Their line qualities and surroundings influence their meanings. Dialogue and narration are traditionally rendered at a later stage of production by a separate letterer, after the penciler and inker have completed their work. The size, shape, and color of lettering can denote volume, tone, or intensity, especially when representing speech. Bolding is especially common, typically applied to multiple words per sentence. Sound effects, however, are drawn by primary artists as part of the images. These are onomatopoeic words or letters that represent sounds in the story world. Often the lettering style is so expressive it communicates more than the letters’ linguistic meaning.
Words are typically framed within a panel. Spoken dialogue appears in talk balloons (traditionally an oval frame with a white interior), internal monologues in thought balloons (traditionally a cloud-like frame with a white interior), and unspoken narration in caption boxes (traditionally rectangular and colored, though sometimes narration appears in separate caption panels or in white gutters). Adding a pointer to a word container and directing it at an image of a character turns the words into sound representations or, if a thought balloon, into representations of an unspoken but linguistic mental process, both linked to the specific place and time of the depiction.
The absence of a pointer on a caption box indicates that the words originate from outside of the depicted scene. First-person narration with no pointer may be linked to a remote setting if the words are composed by a character from some other, implied moment and location that is not visually depicted. Though the words in talk balloons are understood to be audible to characters, the drawn words and containers are not visible within the story world even when drawn blocking story elements. As with lettering, the size, shape, and color of containers communicate additional meanings about the words. For balloons, the graphic quality of the balloon edges denotes how the words are thought, spoken, whispered, shouted, etc. Finally, the containers create semantic units similar to line breaks or stanzas in poetry.
Words also influence and are influenced by surrounding images that are part of the subject content. Pioneering comics artist Will Eisner identifies two kinds of images: a “visual” is a “sequence of images that replace a descriptive passage told only in words,” and an “illustration” is an image that “reinforces (or decorates) a descriptive passage. It simply repeats the text” (132). Scott McCloud goes further, identifying seven “distinct categories for word/picture combinations” (). Two of McCloud’s categories, “word-specific” and “duo-specific,” correspond with Eisner’s “illustration,” while the other five (picture-specific, intersecting, parallel, interdependent, montage) fall under Eisner’s “visual,” which “seeks to employ a mix of letters and images as a language in dealing with narration” (139).
To indicate the level of image-text integration, we combine and arrange McCloud’s and Eisner’s categories in a spectrum, beginning with the highest level of integration.
Montage visual: images include words as part of the depicted subject matter. This is the only instance in comics in which words are part of the story world. All other words are discourse only.
Interdependent visual: images and words communicate different information that combines.
Intersecting visual: images and words communicate some of the same information, while also communicating some information separately.
Image-specific visual: images communicate all information, while words repeat selected aspects.
Word-specific illustration: words communicate all information, while images repeat selected aspects.
McCloud also includes two categories that are not integrated, and we add two more.
Duo-specific illustration: images and words communicate the same information. Although this might appear to be the most integrated category, there is no integration if each element only duplicates the other so that no information is lost if either element is ignored. Words and images are independent.
Image-only visual: isolated images communicate all information. Since comics do not require words, this is the most fundamental aspect of the form.
Word-only text: isolated words communicate all information. This requires the highest level of reader visualization, an approach at odds with graphic narratives as a form.
Parallel visual: images and words communicate different information that does not combine. This requires the same level of reader visualization as word-only texts, but the presence of images complicates and potentially interferes with that visualization.
With the exception of the most integrated category, montage visuals, all combinations of words and pictures produce some level of image-text tension because words, unlike images, exist only as discourse. Though drawn on the page, words are not visually perceptible to the characters in the story. Images, however, depict content that is perceptible to characters, so drawn objects and actions appear as both discourse (ink on paper) and diegesis (the world of the story). A drawing of a superhero flying (discourse) communicates the fact that the superhero is flying in the story (diegesis). The words “the superhero is flying” communicate the same diegetic fact, but the ink-formed letterforms bear no resemblance to their subject matter. There is no overlap between diegesis and discourse. Since both words and images are made of ink lines on paper (because printed words are images), some lines in a comic exist only in the reader’s world and some appear to exist in both the reader’s and the characters’ worlds.
Graphic novels create further image-text tension by highlighting the potential gap between text-narration and image-narration. In graphic memoirs such as Art Spiegelman’s 1980-1991 Maus, Marjane Satrapi’s 2003 Persepolis, and Alison Bechdel’s 2006 Fun Home, the text-narrator and the image-narrator are understood to be the same person, the actual author. When a character in a graphic novel controls the first-person text-narration in caption boxes, it is not necessarily clear whether that character is also controlling the image-narration in panels. If the words are generated by an omniscient third-person text-narrator, does that same narrator generate the images, or are the images generated by a separate narrator?
Unintegrated image-texts imply a separate text-narrator and image-narrator. In the case of a duo-specific image-text, the two modes of narration duplicate information without any integration, as if the two narrators were unaware of each other. Integrated image-texts, however, imply a single narrator controlling both words and images in order to combine them for a unified effect. At the center of the spectrum, a word-specific illustration implies an image-narrator aware of text but a text-narrator unaware of image. Similarly, an image-specific visual implies a text-narrator aware of images but an image-narrator unaware of text.
Parallel visuals are more complex; although the two narrations are independent and so seemingly unaware of each other at the level of the panel, the overarching effect is integrated. In such cases, the separate text- and image-narrations may create a double image-text referent, in which a word has one meaning according to its linguistic context but, when read in the context of the image, acquires a second meaning. Alan Moore is best known for this approach, having perfected it with Dave Gibbons in Watchmen.