A Brave New World: The New Semiotics of the WWW

I just finished validating a classmate’s proofreads on our Wikisource page. As I was working, I found myself wondering how all the {{}} and ‘’’’ and |’s of markup language impact the original text, if at all. Does the mechanical language of the internet detract from the figurative language of the author? Does it complement it? Does it matter at all if you can’t see it, anyway? The pervasive use of Wikisource, Wikimedia, and the web generally to access texts suggests that the answer to that last question is a resounding “yes”.

While working with The Tales of John Oliver Hobbes I find myself preoccupied with dashes, italics, and spaces, rather than the actual text. I just proofread thirty pages of OCR and validated another thirty, and don’t know much about the story other than that there is a Sophia and a Wrath, and that theirs is a tumultuous love affair. But that’s all. My mind was not paying attention to the details of figurative language, or trying to interpret meaning or historical significance. Rather, all my attention was focused on a realm of mechanics I pretty much had no idea existed until settling in to work on this assignment. I wonder what Ferdinand de Saussure would have to say about internet markup language? The “signifier” and the “signified” are concretized, and accuracy of language becomes paramount because the computer doesn’t abstract or interpret the way we (humans) do. If we worry about the word “tree” and why it means wood, leaves, and sticks, then we should also consider why {{nop}} means “add a gap”.

Ed Folsom discusses how database deconstructs traditional narrative, and in turn creates the genre of the twenty-first century. Responses to Folsom suggest that rather than deconstructing narrative, database enhances it and complicates linearity. Similarly, I think that databases, such as Wikisource and Wikimedia, cause our understanding of language to evolve. Of course, in traditional written work the editor comes in and slashes and changes and demands, but presumably to help the text aesthetically, to improve its quality for the reader. We are doing something similar when editing OCR, but in our pre-annotation phase it is purely mechanical, not based on understanding characters, tone, etc. As scholars, we know that commas and dashes aren’t merely mechanical. They are important aspects of the text that cannot be ignored, so the language that renders the text mechanically “correct” (or at least reflective of its original form) is of equal importance.

The way we interact with original texts via the internet is changing, and in some cases the manipulation of the language is utterly invisible. In response to Folsom’s assertion, Jerome McGann says “The point is that all our documents are always multiply coded and that scholarship preserves and studies the multiple meanings” (1589). Markup language is clearly a form of coding, one dependent on accuracy and an understanding of how the internet works. How will we examine the relationship between mechanical correction of OCR and literary language? Alongside typical literary tools, like imagery and characterization, markup language offers a new way to read the text in a contemporary setting, and should not be ignored. Seeing beyond the visual and looking at the mechanics of databases is essential because it is what shapes our understanding of online texts.

So, in my fledgling relationship with internet mechanics, I wonder how this relatively new form of language will add layers to texts, and whether those layers will require as much rigorous academic attention as the original words do.

Do you think this matters, or am I being pedantic? “Every bone in my body fairly aches with reason!”

3 Responses to A Brave New World: The New Semiotics of the WWW

  1. steven.jankowski says:

    I wonder how this relatively new form of language will add layers to texts

    I think the question here has some interesting implications. However, before engaging with it directly, I have some reservations about the usage of the word “new”. If we claim that this is a “new form of language”, we are enacting a particular form of historical amnesia.

    To illustrate my point, we can turn to the printing practices that were prevalent for the better part of 500 years. Preparing an author’s text for publishing was quite literally a mechanical operation: placing tiny pieces of type into a composing stick. As well, the idea of marking up (markup) for format and style (not semantics) has been integral to the publishing of a text. In this clip at 2:00, you can see them talk about the “markup man” who makes typographic choices about how the text is to be displayed. Then the spec’d manuscript is passed on to the machine operators. These activities are, in many ways, similar in intention to our activities on Wikisource. In this respect, coding a text is by no means new.

    • laura.chapnick says:

      Combining Margaret’s and Steven’s ideas, I think that it is significant to consider the interrelation between all of these elements. It is important to acknowledge the history of something like “markup”; however, how does this history contribute to the interplay between markup language and poetic or rhetorical devices? Thinking about the text, how it is coded, and how it relates back to another literary period requires a heightened form of attention from the reader. He or she must make connections not only between symbols, motifs or allusions but also to the production of the work itself. For this reason, could we classify the online edition as inherently self-conscious?

  2. Pingback: Hidden Narrative: The Text behind the Text | The Tales of John Oliver Hobbes
