When we think of archives in the modern day, we think of digital records and digital record-keeping. We have moved from using paper, to digitizing it, to, more recently, creating records digitally (so-called born-digital records).

In this post, I focus on mass digitization and on the issues organizations face when undertaking it. My starting point is an article entitled “Access and Preservation in Mass Digitization Projects,” by John Yolkowski and Krista Jamieson.

In the article, the authors discuss a mass digitization project at Dalhousie University. A donor gave the university a certain amount of money to digitize a fonds, and the project had to be completed within a year.

The article is candid about the issues and complications the project ran into. It also lets the reader look beyond these issues to broader points of discussion in the field of archives and records management.

Mass digitization projects require time, labour and technical equipment capable of processing large volumes of records. In many cases, records need to be digitized one at a time so that the quality of the resulting document is not compromised. This is a time-consuming process, and institutions now often hire temporary staff for digitization projects. Permanent staff alone might not be enough, especially when a project has a relatively short deadline.
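To make the staffing question concrete, here is a rough back-of-the-envelope sketch in Python. The function name and every figure in it (page count, scanning rate, working hours) are hypothetical values chosen for illustration. None of them come from the Dalhousie project.

```python
import math


def staff_needed(total_pages, pages_per_hour, hours_per_week, weeks):
    """Estimate how many full-time scanning staff a deadline requires.

    All inputs are assumptions supplied by the planner; the estimate
    ignores training, quality control, and equipment downtime.
    """
    hours_required = total_pages / pages_per_hour
    hours_available_per_person = hours_per_week * weeks
    # Round up: a fraction of a person still means another hire.
    return math.ceil(hours_required / hours_available_per_person)


if __name__ == "__main__":
    # Hypothetical project: 500,000 pages scanned one at a time,
    # 60 pages per hour per person, 35-hour weeks, 48 working weeks.
    print(staff_needed(500_000, 60, 35, 48))
```

Even a crude estimate like this shows how quickly a one-year deadline can exceed what a small permanent team can handle.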

Scanners with automatic document feeders can help, but sometimes original documents are old and fragile. Handling them is a delicate job, and exposing them to light can accelerate their deterioration.

Studying the specific goals and aims of the projects can help set priorities. Are we digitizing the whole of the records? Among these records, what is of most relevance to users? Who determines this relevance and based on what?

Going back to the idea of not compromising the quality of the images, digitizing records in bulk produces a very large volume of image data. To obtain a high-quality, representative image of each document, scanning in high resolution is important. Organizations leading such projects must have enough storage capacity to handle these massive image files.
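A quick calculation gives a sense of the scale. The sketch below estimates the uncompressed size of a scanned page from its physical dimensions, scanning resolution and colour depth. The page size, resolutions and collection size are illustrative assumptions, not figures from the article, and real master files may be smaller if a lossless compressed format is used.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Return the uncompressed size in bytes of one scanned page.

    width_in and height_in are page dimensions in inches, dpi is the
    scanning resolution, and bits_per_pixel is the colour depth
    (24 for 8-bit RGB).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def collection_terabytes(pages, bytes_per_page):
    """Return total storage in decimal terabytes (10**12 bytes)."""
    return pages * bytes_per_page / 10**12


if __name__ == "__main__":
    # Hypothetical collection: 100,000 letter-size pages in 24-bit colour.
    for dpi in (300, 600):
        per_page = uncompressed_scan_bytes(8.5, 11, dpi)
        total_tb = collection_terabytes(100_000, per_page)
        print(f"{dpi} dpi: {per_page / 10**6:.1f} MB per page, "
              f"{total_tb:.2f} TB in total")
```

Doubling the resolution quadruples the pixel count, so the choice of scanning resolution has a large effect on the storage an institution must plan for.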

Then there is accessibility. Since one of the end goals of archives is access, and ultimately reference, how can users work with such big files? Compressing them is a common option, but it might not suit every user. Depending on the reason for consulting a record, some users need to see its smallest details, and these details often disappear when the document is compressed. Finding the most suitable file format for users is also a point of discussion, since it affects quality, file size and ease of access.

Another complication with accessibility is copyright. This is a broader issue, not specific to mass digitization. However, it is worth mentioning here because the more records we make accessible, the more copyright permissions we need to seek. The question with mass records is whether it is possible to obtain permission for all of them. In the case of the Dalhousie project mentioned above, the university took the risk of publishing certain documents without prior approval. It developed a risk assessment strategy and placed forms online for rights holders to file claims.

Similar to all other projects, time and budget are two main factors controlling the overall running of mass digitization projects. Finding a good balance between them is key to the success of the project.

Each of the issues mentioned in this post could be a topic of discussion by itself. These issues are interrelated in the sense that discussing one of them leads to the others. We are talking about scalability, technical requirements, copyright, accessibility, time management and risk assessment.

Are institutions spending an excessive amount of time digitizing and processing these mass records when the file formats they use may become obsolete? Can the human labour involved in this kind of work be replaced by technology in the future? And how can institutions find better ways to manage their time when dealing with mass records?
