Dropping into the “Preserving Liquid Communication” Symposium

On February 11th, I had the pleasure of attending the Preserving Liquid Communication symposium held by the Association of Canadian Archivists chapter at UBC. Here is a summary of Panel 4, titled, “Liquid Communication in Archives: Theoretical Considerations, Tools, Strategies, and Tips from Work in the Field” with panelists Babak Haidzadeh, Patricia Klambauer, and Erin O’Meara.

Patricia Klambauer oversees the tech department at Library and Archives Canada. Her projects involve documenting the 2010 Olympics and federal campaigns. To date, her team has harvested 28 terabytes of website and social media content.

Babak Haidzadeh is a computer engineer and professor at the University of Maryland. His role entails transferring Twitter content to the Library of Congress and devising methods of categorization. Due to the sheer volume of tweets sent out each day (upwards of 500 million), his team focuses on three areas:

  1. Conflict (tweets concerning wars, or tracking terrorist activity, e.g. studying posts from ISIS)
  2. Labour (tweets sent by unions, public relations of corporations)
  3. Mass Media (tweets as a medium for broadcasting news or advertisements)

Erin O’Meara is a manager at a private archives serving the (Bill) Gates family, its philanthropic foundation, and related companies (minus Microsoft).

Many good points were raised by the panel. However, due to the nature of the discussion, the conversation turned very technical at times (there were some very specific terms about computer coding) and became difficult to follow. Here are the key points I’ve gleaned from the dialogue:

  1. Project management strategies

Before beginning any project, always present ideas as a business case. Questions that should be asked include, “Why should we harvest this content? What is the cost? Can we justify the cost? What is the impact of not collecting this?”

  2. Common harvesting tools, and their successes or limitations

Bagger is a software application developed by the Library of Congress that bundles sets of collected files together. However, it organizes them only at the bit and file level; it does not work at the content or semantic level. At Library and Archives Canada, Twark is used to capture databases of tweets, while Offline Explorer Pro is a desktop tool which records live streamed content.
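To make "bit and file level" a little more concrete, here is a minimal sketch of the kind of checksum manifest that BagIt-style packaging relies on. This is not Bagger's actual code: the function names `write_manifest` and `verify_manifest` are my own, and I am assuming a folder laid out with the harvested files under `data/` and a `manifest-sha256.txt` beside it.

```python
import hashlib
from pathlib import Path


def write_manifest(bag_dir: Path) -> Path:
    """Write one SHA-256 checksum line per file found under bag_dir/data."""
    data_dir = bag_dir / "data"
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_dir: Path) -> list[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    problems = []
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. when the bag holds no files
        digest, rel_path = line.split("  ", 1)
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append(rel_path)
        elif hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            problems.append(rel_path)
    return problems
```

Notice that the check only confirms the bytes are unchanged. It says nothing about whether a file is a tweet, who wrote it, or what conversation it belongs to, which is exactly the limitation the panel described.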

Curating social media content is vastly different from dealing with websites. Each type of digital object (a tweet, a Facebook post) is coded in a completely different way, so it is impossible to create generic software that handles them all. There is no “one size fits all” solution, and that is what makes digital archiving so daunting and costly.

  3. The translation of traditional archival practices to a digital environment

Appraisal continues to be an important element of digital archiving. It is vital to identify which level of content archivists are looking for and to set “crawl” parameters. For example, conversations in the form of comments can continue for months and are often very intermittent. Tweets can be retweeted. Archivists now face the challenge of setting boundaries on these ongoing conversations and forging connections between sets of data collected at different times.

  4. Opportunities offered by a liquid archives

The beauty of online content is that each tweet or comment has a digital footprint. Thus, unlike a physical letter, it is almost always possible to trace its path back to its point of origin.
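As a rough illustration of what tracing a footprint might look like once the content is harvested, here is a small sketch. The record layout is entirely hypothetical: I am assuming each harvested post carries an `id` and a `parent_id` pointing at the post it replies to or reshares, which is not how any particular platform labels its data.

```python
def trace_to_origin(post_id, posts):
    """Follow parent links from a post back toward the original post.

    `posts` maps a post id to a record with a hypothetical "parent_id" key.
    The walk stops at the origin, at a parent that was never harvested,
    or if the links loop back on themselves.
    """
    path = [post_id]
    seen = {post_id}
    parent = posts[post_id].get("parent_id")
    while parent is not None and parent in posts and parent not in seen:
        path.append(parent)
        seen.add(parent)
        parent = posts[parent].get("parent_id")
    return path


# Made-up sample records, for illustration only.
posts = {
    "a1": {"author": "news_desk", "parent_id": None},
    "b2": {"author": "reader_one", "parent_id": "a1"},
    "c3": {"author": "reader_two", "parent_id": "b2"},
}

print(trace_to_origin("c3", posts))  # ['c3', 'b2', 'a1']
```

The stopping conditions are where the “almost always” comes in: if a link in the chain was deleted or simply never collected, the trail ends at the last post the archive holds.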

Due to the sheer volume of content available, automation has become inevitable, yet it cannot be forgotten that archives are all about the human. Online archives present a great opportunity for archival organizations to work together strategically and gain wider coverage of materials.
