Thursday, June 21, 2012

On Hypertext

I'm digging up some posts from previous blogs I've written that I think still have value. This is the first in a series. Originally published in 2011.

In reading Vannevar Bush's 1945 piece, As We May Think, for which he is credited with conceptualizing hypertext as "associative indexing," I am struck by one crucial difference between the "trails" of his memex and the hyperlinks of the web circa 2011. In the memex, a trail could be created by the reader as he uncovered interesting links. These trails were primarily there for his own recollection, although they could be shared with others by explicitly exporting them. Content could be published with links embedded by the authors, but readers could just as easily create their own links that made sense in their specific context. Along with associative links between published documents, readers could contribute their own analysis, excerpts, and annotations in situ, so that a trail captured an association of thought more completely.
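To make the idea concrete, here is a minimal sketch of what a reader-owned trail might look like as a data structure. Nothing here comes from Bush's essay: the names Step and Trail, the JSON export format, and the example.org URLs are all my own assumptions, chosen only to illustrate a private, annotated sequence of links that can be explicitly exported and shared.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One stop on a trail (hypothetical structure, not from Bush's essay)."""
    url: str        # the document the reader visited
    note: str = ""  # the reader's own annotation or excerpt, kept in place


@dataclass
class Trail:
    """An ordered, reader-owned sequence of steps."""
    name: str
    steps: list = field(default_factory=list)

    def add(self, url, note=""):
        """Append a step and return the trail so calls can be chained."""
        self.steps.append(Step(url, note))
        return self

    def export(self):
        """Serialize the trail so it can be handed to another reader."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def load(cls, data):
        """Rebuild a trail from the text produced by export()."""
        raw = json.loads(data)
        return cls(raw["name"], [Step(**step) for step in raw["steps"]])


# The URLs below are placeholders, not real pages.
trail = Trail("bows")
trail.add("https://example.org/turkish-bow", "Why was the short bow superior?")
trail.add("https://example.org/elasticity", "Physical constants of materials")

shared = trail.export()      # explicit export: nothing is public until this step
copy = Trail.load(shared)    # another reader imports the same trail
```

The point of the sketch is that the trail belongs to the reader rather than to any page: the notes travel with the links, and sharing is a deliberate act rather than the default.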

In today's web, links are the same for every reader, placed only by the author. Some pages have a mechanism for reader-created links by means of comments, which invariably come at the bottom of the article, and which are frequently nothing more than spam. In any case, these are public, and difficult for the reader to later reference. Methods for annotation and recording thoughts are crude, through the use of blogs, microblogs (like this one), and social media. In this way, individual links can be shared with others, but the concept of creating custom "trails" of associated content is nascent at best, and fully unrealized at worst.

Bush also talks of people who would actively create these links - "trail blazers." In today's web terms, we would call these people curators or mavens - surfacing the best content through their Twitter streams or blogs. However, these sharing mechanisms still treat content atomically, and interestingness has a tendency to surface either very timely or very weird content, such that anyone following a stream of links from a popular aggregation source is likely to still feel lost in a sea of lolcats and headline news. Whether this surfacing is done by an individual or a community, such as reddit, it remains a crude, partial approximation of Bush's vision.

The closest approximation to Bush's memex I can think of is the wiki. Wikis consist of vast collections of tightly hyperlinked content, due in part to a culture of what I'll call optimistic linking. That is, since they have a controlled domain and others can author content independently, an author can make any given term into a link. If a document exists for that link, then it is automatically referenced. If it does not exist, another author can come along and create that content. Wikis can range in scope from the "sum of all human knowledge" in the case of Wikipedia, to wikis more narrowly focused on a given community or project, for example on sites like pbwiki or wikia, which encourage the creation of new, topical wikis.
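Optimistic linking is simple enough to sketch in a few lines. What follows is only an illustration under my own assumptions: the double-bracket [[Term]] syntax, the function names, and the output markers are made up for this example and are not the behavior of any particular wiki engine.

```python
import re

# Assumed link syntax: [[Term]]. Real wiki engines differ in the details.
LINK_PATTERN = re.compile(r"\[\[([^\[\]]+)\]\]")


def render_links(text, pages):
    """Replace each [[Term]] with a marker showing whether its page exists."""
    def replace(match):
        term = match.group(1).strip()
        if term in pages:
            return f"<link:{term}>"    # page exists, so the link resolves
        return f"<create:{term}>"      # page missing, so invite someone to write it
    return LINK_PATTERN.sub(replace, text)


def missing_pages(pages):
    """Return, in sorted order, every linked term that has no page yet."""
    wanted = set()
    for body in pages.values():
        for match in LINK_PATTERN.finditer(body):
            term = match.group(1).strip()
            if term not in pages:
                wanted.add(term)
    return sorted(wanted)


pages = {
    "Memex": "Described by [[Vannevar Bush]] in [[As We May Think]].",
    "Vannevar Bush": "Author of the essay that describes the [[Memex]].",
}

rendered = render_links(pages["Memex"], pages)
todo = missing_pages(pages)
```

In this sketch the author of the Memex page links to "As We May Think" without checking whether anyone has written it. The link shows up as an invitation until another author adds that page, at which point the same text resolves to a normal link with no further editing.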

However, these still fall short of Bush's ideal. They primarily link to internal content. While they can link to external content, they cannot retain the context of a link to an arbitrary point within a page, and - more importantly - they can only link one level deep. After a wiki reader clicks through to an external link, she is no longer on the wiki, and she can no longer create a link to an arbitrary associated page.

It is also difficult to add annotations or new analysis. While most wikis are set up to allow anyone to contribute by creating and editing pages, they have a culture of preferring "content" to "metacontent". Since any edits made by one author are visible to all, there is a certain self-conscious act of "publishing" involved in editing a wiki. Further, an author's edits can be undone or refined by other authors, making a wiki non-permanent and poorly suited for an author's individual note-taking for later recall.

Is there a need for these additional capabilities? I think it depends on our aims in reading and processing the information we find online. If we are merely flitting from one article to the next, possibly stopping to blog or tweet or comment about it along the way as an aside, never to return or further synthesize the information, then our current scheme is sufficient. If, however, we use the web with a more knowledge- and research-centric orientation, then we will soon find it lacking, particularly for personal recall.

This is the aspect of hypertext that Bush most emphasizes, and one which I think is underrealized and potentially quite valuable. He characterizes his hypertext memex machine as "an enlarged intimate supplement to his memory." For simple facts and figures, a generic web search may prove to be immediate enough. "What's the movie with Tom Hanks and the volleyball?" for example. But for more complex intellectual concepts, a less transient, more personal means of assisting research, recall and serendipitous discovery of information is necessary.

Nicholas Carr excoriates hypertext - and computers generally - in his work "The Shallows," claiming that hyperlinks contribute to shorter attention spans and flighty trains of thought. But he's making observations on the current implementations of hypertext. I take issue with his conclusions even given the state of the web and of hypertext in 2011. But given the addition of certain key conceptual features, ad-hoc hypertext has the promise of greatly increasing our ability to leverage, process, and synthesize the vast swaths of written knowledge. We must invent the tools to enable us to blaze trails across terra semicognita.
