That title means both “words and pictures” and “words in pictures,” because both phrases describe comic books. Although not all comics include words, essentially all superhero comics do. (A near exception, the five-page “Young Miracleman” story in the back of Miracleman #6, includes two talk balloons each containing the transformation-triggering word “Miracleman!” and a range of sound effects, newspaper headlines, and signage.) How images and text work together is one of the most complex and distinctive qualities of the form.
Words in comics have their traditional linguistic meanings, but they are also drawn images that must be understood differently than words in prose-only works. Their line qualities and surroundings influence their meanings. Dialogue and narration are traditionally rendered at a later stage of production by a separate letterer, after the penciler and inker have completed their work. The size, shape, and color of lettering can denote volume, tone, or intensity, especially when representing speech. Bolding is especially common, typically applied to multiple words per sentence. Sound effects, however, are drawn by primary artists as part of the images. These are onomatopoeic words or letters that represent sounds in the story world. Often the lettering style is so expressive it communicates more than the letters’ linguistic meaning.
Words are typically framed within a panel. Spoken dialogue appears in talk balloons (traditionally an oval frame with a white interior), internal monologues in thought balloons (traditionally a cloud-like frame with a white interior), and unspoken narration in caption boxes (traditionally rectangular and colored, though sometimes narration appears in separate caption panels or in white gutters). Adding a pointer to a word container and directing it at an image of a character turns the words into sound representations or, if a thought balloon, into representations of an unspoken but linguistic mental process, both linked to the specific place and time of the depiction.
The absence of a pointer on a caption box indicates that the words originate from outside of the depicted scene. First-person narration with no pointer may be linked to a remote setting if the words are composed by a character from some other, implied moment and location that is not visually depicted. Though the words in talk balloons are understood to be audible to characters, the drawn words and containers are not visible within the story world even when drawn blocking story elements. As with lettering, the size, shape, and color of containers communicate additional meanings about the words. For talk balloons, the graphic quality of the balloon edges denotes how the words are thought, spoken, whispered, shouted, etc. Finally, the containers create semantic units similar to line breaks or stanzas in poetry.
Words also influence and are influenced by surrounding images that are part of the subject content. Pioneering comics artist Will Eisner identifies two kinds of images: a “visual” is a “sequence of images that replace a descriptive passage told only in words,” and an “illustration” is an image that “reinforces (or decorates) a descriptive passage. It simply repeats the text” (132). Scott McCloud goes further, identifying seven “distinct categories for word/picture combinations” (). Two of McCloud’s categories, “word-specific” and “duo-specific,” correspond with Eisner’s “illustration,” while the other five (picture-specific, intersecting, parallel, interdependent, montage) fall under Eisner’s “visual,” which “seeks to employ a mix of letters and images as a language in dealing with narration” (139).
To indicate the level of image-text integration, we combine and arrange McCloud’s and Eisner’s categories in a spectrum, beginning with the highest level of integration.
Montage visual: images include words as part of the depicted subject matter. This is the only instance in comics in which words are part of the story world. All other words are discourse only.
Interdependent visual: images and words communicate different information that combines.
Intersecting visual: images and words communicate some of the same information, while also communicating some information separately.
Image-specific visual: images communicate all information, while words repeat selected aspects.
Word-specific illustration: words communicate all information, while images repeat selected aspects.
McCloud also includes two categories that are not integrated, and we add two more.
Duo-specific illustration: images and words communicate the same information. Although this might appear to be the most integrated category, there is no integration if each element only duplicates the other so that no information is lost if either element is ignored. Words and images are independent.
Image-only visual: isolated images communicate all information. Since comics do not require words, this is the most fundamental aspect of the form.
Word-only text: isolated words communicate all information. This requires the highest level of reader visualization, an approach at odds with graphic narratives as a form.
Parallel visual: images and words communicate different information that does not combine. This requires the same level of reader visualization as word-only texts, but the presence of images complicates and potentially interferes with that visualization.
With the exception of the most integrated category, montage visuals, all combinations of words and pictures produce some level of image-text tension because words, unlike images, exist only as discourse. Though drawn on the page, words are not visually perceptible to the characters in the story. Images, however, depict content that is perceptible to characters, so drawn objects and actions appear as both discourse (ink on paper) and diegesis (the world of the story). A drawing of a superhero flying (discourse) communicates the fact that the superhero is flying in the story (diegesis). The words “the superhero is flying” communicate the same diegetic fact, but the ink-formed letterforms bear no resemblance to their subject matter. There is no overlap between diegesis and discourse. Since both words and images are made of ink lines on paper (because printed words are images), some lines in a comic exist only in the reader’s world and some appear to exist in both the reader’s and the characters’ worlds.
Graphic novels create further image-text tension by highlighting the potential gap between text-narration and image-narration. In graphic memoirs such as Art Spiegelman’s 1980-1991 Maus, Marjane Satrapi’s 2003 Persepolis, and Alison Bechdel’s 2006 Fun Home, the text-narrator and the image-narrator are understood to be the same person, the actual author. When a character in a graphic novel controls the first-person text-narration in caption boxes, it is not necessarily clear whether that character is also controlling the image-narration in panels. If the words are generated by an omniscient third-person text-narrator, does that same narrator generate the images, or are the images generated by a separate narrator?
Unintegrated image-texts imply a separate text-narrator and image-narrator. In the case of a duo-specific image-text, the two modes of narration duplicate information without any integration, as if two narrators are unaware of each other. Integrated image-texts, however, imply a single narrator controlling both words and images in order to combine them for a unified effect. At the center of the spectrum, a word-specific illustration implies an image-narrator aware of text but a text-narrator unaware of image. Similarly, an image-specific visual implies a text-narrator aware of images but an image-narrator unaware of text.
Parallel visuals are more complex; although the two narrations are independent and so seemingly unaware of each other at the level of the panel, the overarching effect is integrated. In such cases, the separate text- and image-narrations may create a double image-text referent, in which a word has one meaning according to its linguistic context but, when read in the context of the image, acquires a second meaning. Alan Moore is best known for this approach, having perfected it with Dave Gibbons in Watchmen.
Oooh, this:
“Dialogue and narration are traditionally rendered at a later stage of production by a separate letterer, after the penciler and inker have completed their work”
really needs some qualification, Chris. It’s true of some traditions, but by no means all. (I know that of course you know that, but…)
In fact, the traditional sequence in American comic books was pencils-letters-inks. This only changed with the coming of computerised fonts.
‘Sound effects, however, are drawn by primary artists as part of the images.’
By no means universally. Again, in traditional US comics the letterer was likely as not to draw the sound effects, even if roughly indicated by the penciller.
Thank you both for the clarifications. I’ll incorporate them in the book manuscript.
I love these posts, though i approach them with wariness; part of me thinks we’re shooting ourselves in the foot by reading ‘our’ medium this way, i mean with such a ‘factual’ vocabulary, postulating that the image communicates this and the text that thing, it may make reading for suggestiveness and ambiguity more difficult.
In response to ‘comics do not require the use of words’ i’d like to point towards the venerable Saint Augustine’s dialogue with his son where they discuss that to describe a thing you need something that is not itself that thing- in regards to comics, it might be interesting to investigate to what extent the more verbally-oriented comics activate our brain to see the pictures as words too or to re-describe them to itself in words?
Finally, i think ( i’ll just keep saying this, sorry) that the notion of visuals acting as Narrator is deeply problematic. I’d say that comics even explores the potential ( and often the literal ( in the form of the gutter)) gap between text-narration and image-narration, i.e. the medium uproots the entire idea of a narrator- in that sense, as in many others, comics exhibits one of the key traits of Gothic fiction- the unreliable narrator. All narration in comics is unreliable since, being mostly visual, reading becomes an act of interpretation. The Watchmen panel at the end is a case in point- it suggests a point-of-view ( the assassin) yet that point would be roughly equivalent to a tilted camera just to the side of Veidt’s lower right hand jacket pocket; but the red tints the panel is coloured in do not indicate an assassin’s coolly detached plan being enacted- they are suggestive of the violence Comedian must feel being done to him; the colour also prefigures the bloody end of that fall, underscoring the poignancy of the complementary lines from Rorschach’s diary; this panel might seem to ‘narrate’ from Rorschach’s point of view, but in fact it places the reader’s sympathy with three characters at once and with a kind of omniscient irony that points towards the artifice of the entire construction- which neatly seems to encapsulate Watchmen’s aims as a whole. Comics are a deeply artificial medium. It’s no wonder that some of the most interesting comics around are the ones exploring this artificiality to great effect ( Pim & Francie, Building Stories, Ice Haven).
Scattered thoughts, sorry to ramble.
Thanks so much, Ibrahim. And let me concur that I most enjoy levels of ambiguity in narrative. My hope is not to eliminate those levels, but to better appreciate them.
As far as image-narration, that’s definitely a concept I’m still working through. But your excellent analysis of that Watchmen panel describes what seem like narration qualities (POV, expressionistic coloring). For there to be a gap between the text- and image-narrations, then (to state the obvious) image-narration must exist. The tripping point may be that “narration” suggests words, because it traditionally has only described words. But if images can tell a narrative, then the telling of that narrative is (to again state the obvious) a narration. And comics, as you say, are so much about the gap between the two kinds.
Well, i guess i set myself up for that, by describing that Watchmen panel in terms i imagined an analysis would use. The danger of using your interlocutor’s terminology to point something out is, of course, that it might seem you agree with that terminology. I don’t. i was trying to convey that while the panel certainly narrates, any description of that narration must be confined to what can be objectively enumerated about that panel- and that is very little. Taken out of context, it might well appear that the pair of hands is diving out after the Comedian, or it might seem that he’s actually crashing into a glass roof. In comics, just about every single thing is interpretation- let alone the emotional states of characters.
You are right to say that the panel contains ‘what *seems* like narration qualities’. Because the POV swims where there can’t really be a POV & the expressionistic colouring does not actually express anything except a vague mood of violent action & as such is merely a well-worn utilitarian device to hold the reader’s attention.
It is certainly true that, for there to be a gap between what you, Chris, call text- and image-narrations, image-narration must first be established as a discernable quantity. I, however, prefer not to call it that. It is my assertion that this thing we’re discussing might best be named something other, like ‘image- and text-interpretation,’ since calling it ‘narrative’ implies that what we see is actually there, or intended by the author to be there as we see it.
A tripping point is not the narration-suggests-words aspect, but the fact that images work in fundamentally different ways, which the word ‘narration’ does not comprehensively cover.
The history of art is not the history of literature.
Perhaps the division is text-narration and image-depiction? I agree that images require interpretation. But text also requires interpretation. Which is to say there is inference in both.
I fear as a term image-depiction though is largely singular. As soon as you have two images, ie sequential art, you have something akin to narrative. So while the history of art is not the history of literature, perhaps the history of narrative art is?
“Taken out of context” means taking the image out of its narrative sequence, which means taking it out of its role in narrating the story. You can do the same thing with any sentence from any novel, showing how the sentence’s now out-of-context referents require all kinds of ambiguous interpretation.
Okay, full disclosure ( as if you hadn’t guessed by now), as a visual artist myself, i feel that comics as an art form will benefit more from a visual-arts approach than from any other, if we are to point towards the medium’s innate possibilities. & because of the relative preponderance of comics described in near-cinematic or near-literary terms, i think it’s useful to stress the other possible approach.
Mr. Mahendra Singh has made some very laudable efforts in that direction with his essay here on HU on Moebius’ inks; also that recent article in Art in America on Kirby; and mr. Santoro’s reading of the spread as a single unit should also be taken into account, his diagramming notwithstanding.
I think such an approach is not only useful- it is in fact necessary. How else can we discuss the work of Aidan Koch or Andrei Molotiu or Nina Roos, which are barely narrative?
If we describe the essence of comics as juxtaposition, then placing two images next to one another is not ‘akin to narrative’ but a game of contrast or convergence, in any case, a matter of composition- consequently, it does not matter if there are two images placed together or fifty or a hundred- the essential dynamic here is juxtaposition.
I think Ibrahim is right that there’s a tradition of abstraction closer to the visual arts in which image juxtaposition isn’t necessarily narrative.
Despite my going on about narrative, I’m also in agreement. I intend my image-narration analysis to be one tool in a toolbox that is increasingly focused on comics as a visual art. Although narration is a literary concept, I want to employ it visually.
I’m not familiar with Santoro’s reading of the spread, but I’ve had a similar thought. Do you have the essay title, etc.?
Maybe i had just assumed, from reading Santoro’s posts over at the old Comics Comics site & his more recent diagram-heavy posts at his own Comics Workbook blog, that he approaches the spread as a unit? I seem to remember having read a few brief statements to that effect, though.
Even if there is no precedent for this way of reading to be found, anywhere, i’d be curious what the argument reversed would look like- what i mean is, how could one make a case for the spread as not being a compositional unit?
By the way, Chris, do you have a view or list of basic pacing modes? It seems to me for every hundred dialogue scenes in comics i read, there’s only three ways the talking heads are framed and intercut and spatially laid out- that’s insane; it’s like every single song you listen to has the exact same drum fill in the bridge, or something like that. Also, is ‘scene’ a good descriptor with comics?
Ibrahim, Wally Wood (who worked for Marvel briefly in the 60s) has a great take on drawing dialogue with “Wally Wood’s 22 Panels that Always Work!!”
“Scene” is imperfect but I’m still willing to use it for lack of a better term. I like Cohn’s “visual sentences” too.
Milton Caniff could draw the hell out of dialogue, too.
Wally Wood’s “22 panels…” while useful advice if pressured by a deadline & in need of graphic short cuts, is actually part of the problem, in that it does not think beyond a standard back-and-forth scheme of shots. I did not mean drawing solutions so much as the panel-to-panel rhythm versus the cadence of the text…
The thing about ‘scene’ is of course that it’s a carry-over from theatre, so to take it still one medium further away from its source…
It’s not really a basic unit of division in comics the way it is in film ( it’s useful on-set to describe what you’re working on, but in comics the economics and logistics allow for many other ways to organize & chart the working process).
I agree with both of your points, Ibrahim. The units of comics discourse are panels, pages, and spreads, which diegetic scenes have no immediate relationship to.
And I think one reason dialogue is a problem is because it’s mostly non-visual. Most dialogue could be rendered in a column of back-and-forth prose next to a single image of two figures. If you use Cohn’s narrative grammar, most of the panel content isn’t really doing anything–except (in Eisner’s sense) “illustrating” the words.
Moore and Gibbons deal with that by interweaving dialogue with parallel scenes (sorry, don’t know a better word yet), which often creates double referents and so all kinds of interesting things are happening visually. You could do the same within the dialogue scene if the panel content focused on something other than the two talking figures (what, for example, might their feet be doing under the table simultaneously?). So the visuals then have their own story or subplot or visual sentence or whatever to depict beyond simply indicating the already established action of “they are talking.”
“Most dialogue could be rendered in a column of back-and-forth prose next to a single image of two figures” Sim and Gerhard often did this in Cerebus…
“the visuals then have their own story or subplot or visual sentence or whatever to depict beyond simply indicating the already established action of ‘they are talking.’” …and Walt Kelly would do this in Pogo. While the “A” dialogue ran, he’d draw a parallel “B” routine of some other slapstick action or secondary dialogue (often using smaller characters like bugs or kids in the bottom part of the scene).
I was once a big Cerebus fan, lots of great stuff back there. Also lots of anti-feminist ranting.