
Announcement: Text Encoding (Markup)

Contemporary text-encoding and text-markup approaches. (See also Historical Encoding & Formatting Inventions)

Meaningful Machines

New York-based corporation that aims to improve translation technology.

“Based on a core technology that understands natural language, Meaningful Machines is opening new avenues in text mining, search and retrieval, machine translation, natural language interfaces and artificial intelligence.” (From the home page)

“Running software that took four years and millions of dollars to develop, Carbonell’s machine—or rather, the server farm it’s connected to a few miles away—is attempting a task that has bedeviled computer scientists for half a century. The message isn’t encrypted or scrambled or hidden among thousands of documents. It’s simply written in Spanish.” (From Wired Magazine.)

Starter Links: Meaningful Machines | Evan Ratliff’s “Me Translate Pretty One Day,” in the Dec. 2006 issue of Wired Magazine

Giselle Beiguelman, “esc for escape” (2004)

Online art exhibition that archives error messages from users around the globe.

“esc for escape begun in 2000. It was part of <Content = No Cache>. By that time I invited people to submit error messages asking them: Have you ever read something scary on your screen? Do you understand why programmers suppose they are programming for programmers? Do you fear error messages? I collected these messages for one year. Nevertheless, new operating systems and new forms of connection, inspired me to redesign the project and to update it to Windows Xtra Problems, OS X bugs and to give it a different format (a teleintervention + a DVD documentary).” (from the project’s “Book of Errors” page.)

Starter Links: “esc for escape” | the artist’s home page

Transliteracies Research Report by Lisa Swanstrom

Wiki markup

Wiki markup is a curious text-encoding phenomenon: every wiki has its own markup, and all of them produce essentially the same results, but many wikis use different symbols to do so. In MediaWiki, an editor surrounds a sub-heading with equals signs, like so: =subheading1= or ==subheading2==. PBWiki achieves the same effect with this syntax: !subheading1 or !!subheading2. Similarly, a MediaWiki editor marks words as bold by surrounding them with three apostrophes, while a PBWiki editor uses two asterisks. Both wikis encode the same end result with different symbols.
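Because the two syntaxes differ only in their symbols, translating between them is largely mechanical. The following is a minimal Python sketch, assuming only the heading and bold conventions described above; the function name mediawiki_to_pbwiki is hypothetical, and neither wiki's full syntax is covered.

```python
import re

def mediawiki_to_pbwiki(text: str) -> str:
    """Rewrite MediaWiki-style headings and bold as the PBWiki-style equivalents.

    Handles only the two conventions described in the entry above;
    everything else is passed through unchanged.
    """
    converted = []
    for line in text.splitlines():
        # A heading is wrapped in the same number of '=' on both sides.
        heading = re.fullmatch(r"(=+)\s*(.+?)\s*\1", line.strip())
        if heading:
            level = len(heading.group(1))
            line = "!" * level + heading.group(2)
        # Bold: three apostrophes in MediaWiki, two asterisks in PBWiki.
        line = re.sub(r"'''(.+?)'''", r"**\1**", line)
        converted.append(line)
    return "\n".join(converted)

if __name__ == "__main__":
    sample = "==subheading2==\nA '''bold''' word."
    print(mediawiki_to_pbwiki(sample))
```

The heading depth carries over as the number of exclamation marks, so the structure of the page survives even though every symbol changes.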

Starter Links: MediaWiki markup syntax | PBWiki markup syntax | Meatball wiki’s attempt to address this problem

Tom Jennings, “ASCII: American Standard Code for Information Infiltration”

This document by Tom Jennings describes a history of ASCII (the American Standard Code for Information Interchange) and its immediate ancestors including FIELDATA, ITA2, Murray’s telegraphy code, Baudot’s telegraphy code, and Morse’s telegraphy code. This history provides a thorough foundation for how ASCII came to be and serves as a basis for understanding electronic communication.

This research isn’t a detailed history of the development of character codes per se, but of the codes themselves and their specific meanings.

http://www.wps.com/projects/codes/

Text-Encoding Initiative Standard (TEI)

Basic concept of TEI (and of text-encoding in general) as a markup approach to digitizing literary and other texts:

“The TEI was founded in 1987 to develop guidelines for encoding machine-readable texts of interest in the humanities and social sciences. Its work was supported by the Association for Computing and the Humanities, the Association for Literary and Linguistic Computing, and the Association for Computational Linguistics, and received generous grant funding from the Mellon Foundation, the EEC, the National Endowment for the Humanities, and other institutions. The “P3” Guidelines were delivered in 1994, and have become the de facto standard for encoding of literary and linguistics texts, corpora, and the like.” (from the TEI FAQ)
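To give a rough sense of what encoding a literary text looks like, here is a small Python sketch. The fragment is a hypothetical, simplified sample that borrows TEI-style element names (lg for a line group, l for a verse line); it is not validated against the Guidelines and omits the header and namespace a real TEI document would carry. The function name verse_lines is likewise an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment in the spirit of TEI markup.
# Not a complete or validated TEI document.
SAMPLE = """
<text>
  <body>
    <lg type="couplet">
      <l n="1">So long as men can breathe or eyes can see,</l>
      <l n="2">So long lives this, and this gives life to thee.</l>
    </lg>
  </body>
</text>
"""

def verse_lines(xml_source: str) -> list[tuple[int, str]]:
    """Return (line number, text) pairs for every <l> element found."""
    root = ET.fromstring(xml_source.strip())
    return [(int(line.get("n")), line.text) for line in root.iter("l")]

if __name__ == "__main__":
    for number, text in verse_lines(SAMPLE):
        print(number, text)
```

The markup records what each span of text is (a couplet, a numbered verse line) rather than how it should look, which is what lets software query the text afterwards.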

Starter Links: TEI home | See also Michael Sperberg-McQueen’s “A Gentle Introduction to SGML” for an overview of the text-encoding or text markup concept

XML

Basic concept and implications of XML (and markup language approaches in general):

“Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.” (from W3C Page on XML)
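The data-exchange role mentioned in the quotation can be illustrated with a short Python sketch using the standard library's xml.etree.ElementTree module. The element name entry and the helper names to_xml and from_xml are assumptions made for this example, not part of any standard vocabulary.

```python
import xml.etree.ElementTree as ET

def to_xml(record: dict[str, str]) -> str:
    """Serialize a flat record as a small XML document (one child per field)."""
    entry = ET.Element("entry")  # hypothetical root element name
    for field, value in record.items():
        child = ET.SubElement(entry, field)
        child.text = value
    return ET.tostring(entry, encoding="unicode")

def from_xml(xml_source: str) -> dict[str, str]:
    """Read the record back from the XML produced by to_xml."""
    return {child.tag: child.text for child in ET.fromstring(xml_source)}

if __name__ == "__main__":
    record = {"title": "XML", "source": "W3C"}
    encoded = to_xml(record)
    print(encoded)
    print(from_xml(encoded) == record)
```

Because the tags name each field explicitly, a receiving program can recover the record without knowing anything about the program that wrote it.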

Starter Links: W3C Page on XML | Wikipedia article | See also Michael Sperberg-McQueen’s “A Gentle Introduction to SGML” for an overview of the text-encoding or text “markup” concept

Transliteracies Research Report by Marc Breisinger