The relationship between word and image has a long history; among the earliest books produced in medieval Europe were illuminated manuscripts, which combined text, painstakingly copied by hand, with elaborate illustrations on sheepskin. During the Renaissance, paintings often included painted representations of words spoken by the figures portrayed in the scene. These early examples of the integration of text and image embodied a coherent, Christian worldview; text was understood to represent the “Word of God,” and was thus believed to be inherently truthful. In the modern world, we are less confident about absolute truths. Today, we are likely to encounter words and pictures printed together in magazines, newspapers, or online to shore up claims to truthfulness. In science textbooks, photographs and diagrams illustrate complex ideas. In children’s books and graphic novels, text and images create elaborate fantasy worlds. Billboard advertisements reduce text and images to graphic simplicity so as to deliver their messages with an economical punch. In official documents, text and images are also combined for purposes of bureaucratic record keeping; your passport and driver’s license use words and pictures to identify you, and this vital information is increasingly being integrated and uploaded into multimedia databases. In contemporary media, text and images converge to form a seamless package.1 As the French critic and semiotician Roland Barthes (1915-1980) wrote:
Text and image are related and complementary, but ultimately irreconcilable, ways of delivering information. Words and pictures offer different kinds of information. While we are taught to read from an early age, we have to learn how to see for ourselves.3 We scan images for visual information, which we then interpret based on our own, idiosyncratic knowledge and experience; words are visual in nature, but we decipher them according to a strict set of rules, usually from left to right and top to bottom. Words require that we are able to read a language and understand its vocabulary, grammar, and syntax. But as the philosopher Maurice Merleau-Ponty (1908-1961) wrote, written language
The arbitrary nature of language is explored by the theory of semiotics, which is concerned with signs, or the linguistic relationship between words (the signifier, e.g. “tree”) and the things they represent (the signified, e.g. a particular woody perennial, deciduous or coniferous, and member of the plant kingdom, or the Biblical “Tree of Life,” or alternatively a hierarchical data structure, figure, or graph that branches from a single root, as in a genealogical chart). As linguist Ferdinand de Saussure (1857-1913) noted, the bond between the signifier and signified (this structural unit is known collectively as the “sign”) is merely a matter of convention. There is no essential, underlying natural relationship linking our word for tree with the thing itself. Moreover, language is often “coded” (full of idiomatic expressions, clichés, metaphors, and figures of speech), so that the words we hear or read often refer to something else entirely, as when we say that we fail to see the forest for the trees. Barthes used the terminology of semiotics to explain the difference between words and images; he wrote that images and words are different types of signifiers. As Barthes explained, seeing and understanding a photograph seems to be simple and immediate, because we believe that the experience is not at all like language, which takes time to read or hear, and which often seems to require an advanced degree in linguistics, literature, or a foreign language to fully understand. This is because language, unlike photography, is symbolic; words bear only an arbitrary relationship to the things they describe, whereas photographs bear a direct and causal relationship to the things they depict.5 Pictures (especially photographs) seem more direct, less complicated.
Because photographs traditionally have their origin in an optical and chemical process that occurs in the presence of the things represented, we tend to believe, quite logically, that they constitute objective evidence that the person or event represented really does exist, or did exist, or really happened at one time when the photographer and her camera were present. As such, a photograph is a unique kind of signifier. Whereas a painting or a drawing is an icon, which has a resemblance to the object it depicts, a photograph is an index, produced during an encounter with the real world, of which it is a trace or remnant, fixed in permanent form by a chemical process. Because the photographic image is produced as a result of a direct encounter with its referent, we believe that it is less open to manipulation and duplicity than is language. With photographs, our interpretations seem to take place naturally, almost instinctively. We recognize those people, things, places, or events, or we have seen things like them, or believe that things like them could happen. Barthes described photographs as “messages without a code.” By this, Barthes meant that, in and of itself, a photograph is purely descriptive, a dumb recording of objective, physical facts. Barthes termed this level of representation the photograph’s denotative meaning; he argued that, at the most basic level, the photograph is a transcription of the physical and optical experience in the presence of which the photographic exposure was made. Ironically, an objective “description of a photograph is literally impossible,” according to Barthes, because our descriptions are inevitably interpretations.6 When we put words together to describe a photograph, we move to the level that Barthes calls connotation, in which we describe the photograph using words, which, by definition, cannot be objective. As such, our split-second interpretation of the photograph engages what Barthes terms a semiological system, a universe of signs encoded in the photograph, which are decoded by a reader or viewer (that is, they are translated into language) in accordance with their own experience.