I want it all, I want it all, I want it all, and I want it now...
The truth is, a lot of archivists don't want it all. I don't really want it all. As Oscar the Grouch once said, "There'll be more trash tomorrow."
I just returned from the 2006 meeting of the Society of American Archivists (SAA) in Washington, D.C., where I attended a wonderful session this morning entitled "Everyone’s Doing It: What Blogs Mean for Archivists in the 21st Century."
There was a lot of thought-provoking discussion provided by the three speakers, Beth Kaplan, Jessamyn West, and Bill Landis, about blogging and how to archive blogs (as a side note - the session chair, Kathleen Burns, was the most well-spoken and enthusiastic chair I saw at the entire meeting). In fact, it was so inspiring that I vowed to devote this evening to resurrecting this blog. Bill Landis, however, really captured my attention, because I think he was able to express my feelings towards all of this digital content and the pressures of trying to preserve it.
He argued that you can't save it all, and why would you want to? Yes, there are unique problems posed by the format of blogs. Jessamyn West explained, for example, that blogs are different for each user. The data is stored in a database and blogs only appear linear because of the software that is used to create them and how the user interacts with the blog at any given time. To preserve a blog in perpetuity is something that I am sure will be very difficult to figure out (although I don't really even understand why this should be the case. Aren't computer programmers supposed to be very smart? Certainly a few of them could work on this for a few days and give us something. Maybe SAA should consider starting a "Programmers Roundtable" and we could have our own programmers to order around as we please...).
Bill Landis argued that maybe you don't really need to save a blog in its entirety. He (rightly) pointed out that many "blogs" really are linear. You could print some of them onto paper and not lose any of the content or context of the original. Others may have comments, but sometimes they may not be worth saving. Someone in the audience asked about archiving blogs and websites: what would you do if, for example, a blog was archived but a link to an article at the New York Times was not saved? And what if the blog author had not indicated on the blog what said article was about?
Tough. It's called research skills. I'm grateful to that person for asking that question. Why do we feel like we have to spoon-feed everything to everyone? Look at a typical diary entry from 1850... (I made this up based on a detailed study of reading people's diaries).
Mr. A_t called today. I do declare he is the most ridiculous person I've ever known. His breath smelled of whiskey and I know for a fact that he had spent the afternoon in C_ because Lizzy told me so after he had left. Lizzie thought it right amusing that he should have quoted Mr. Z_a at length about the election. We all know that Mr. J has won because no one would vote for the other.
Let's assume that I come across this diary in a manuscript repository and only know that it was kept by a "Jane Smith" who lived in the capital city of my state. Do I know immediately the identities of "Mr. A_t" or where "C_" is located? What election are they talking about and what if Mr. J. didn't really win? Are "Lizzy" and "Lizzie" the same person? If I am lucky, I may be able to extrapolate some context from this entry after some searches of genealogical records and consultation of history books. Can we not safely assume that there may be some way in the future to figure out which article was linked from a blog to the New York Times based on our knowledge of URL structures and perhaps the date of the posting? Is a link to a newspaper that has disappeared any different than a newspaper clipping that has disintegrated? Just because something is on a blog doesn't mean the writing is any better than it would be otherwise. And if the author is inconsiderate enough to researchers one hundred years in the future to post a link without providing context about why he or she is linking to that article, then can we just let it go?
I think so. All of this talk of saving everything makes me fear for the mysteries of research. The joy of discovery. Don't get me wrong. I think we should use technology to our advantage to help us save and document blogs and other "born-digital" materials. I know that I obsessively save almost all of my emails and even back up the data files in the hopes that someday there will be an easy, transferable way to save them. But I know that my sister doesn't do that. My father doesn't do that. Most of my friends don't do that. My email may be all that survives. And if that is the case, people may only learn about my acquaintances through me. Even so, the person who researches my fascinating life in 2187 is going to have a lot more material available to them than the person who is trying to write an entire doctoral dissertation based on 23 letters written by a Civil War soldier named "John" to his family.
The researcher of the future is going to have to try to understand our form of communication without perhaps replicating it. I understand enough about 19th Century correspondence to know that sometimes letters were folded and the backs used as envelopes. The 22nd Century researcher will have to know about things like emoticons and avatars and icons and abbreviations like WTF and ROTFLMAO in order to read our correspondence. Misunderstanding a :) after a nasty-sounding statement could completely alter someone's historical interpretation of a discourse between two people. It is already difficult to tell whether someone online is in earnest or being passive-aggressive. Humor does not always translate and use of slang might be the same for a high school student as it is for a Rhodes Scholar. Researchers will have to understand this and no amount of replicating an online environment will help them.
History is history. There are all sorts of online emulators available for early video games. A student from the future can play Pacman and conclude that we all must have been extremely bored and stupid in the early 1980s. What they won't get from that emulator is the thrill that I felt when spending a hot summer's day in my uncle's air-conditioned, cigarette smoke-filled house, concentrating for hours on trying to finish a level of Pacman on his brand new, state-of-the-art ColecoVision machine, hoping to have a one-up on my friends the next time I visited an arcade, only to discover that the arcade versions are much harder because it would do the companies no good to have 11-year-olds playing for hours on one quarter. The person using the emulator online in 2187 won't understand the fear in a child's heart when she realizes that there is no change from the $20 her mother gave her to use to pay for a movie and popcorn because she ended up wasting $15 of the money trying to get a gorilla to swing from a rope.
An astute reader of this blog might notice that I listed three participants in this SAA session - Beth Kaplan, Jessamyn West, and Bill Landis. Yet the link I provided to the description at the SAA page lists Catherine O'Sullivan, and not Beth Kaplan, as a speaker. Without my explaining that Catherine couldn't make it and that Beth presented in her place, would the researcher of the future have any clue? How would they know who was right, me or the SAA website? They wouldn't know. They would have to look at other sources, other blogs, etc. to find out. Just like they have to do now when they come across discrepancies and misinformation. It's called research...
And I need to learn how to edit my posts, because I've been told by numerous web designers that people won't read below a page.