As part of the digitization project for the Association of University and College Employees (AUCE) fonds, we digitized an audio cassette tape entitled “The Steward”, a recorded speech about being a union steward. Today, we will show how we digitize an audio cassette tape.

 

Equipment

To digitize a cassette, we use the following equipment:

  • Cassette tape deck: An ION Tape 2 PC USB cassette deck
  • Audio capture and editing software: We use Audacity, a free, open-source application
  • Computer: A Mac Pro running macOS 10.14.5

 

Damage repair

Before starting the digitization, we had to repair the tape as it was broken (this is common with older cassette tapes), and we took this opportunity to put the tape in a new housing (the original housing is shown above). It is important to have the tape in optimal condition before digitization and preservation.

 

Digitization Process

We followed the sample workflow for tape digitization in the Audacity wiki.

  1. First, the cassette deck (Tape 2 PC) is connected to the Mac to export the audio for digitization. We connected the USB cable directly to a free USB port on the Mac, and turned it on.

USB port is on the left. The Tape 2 PC also has an RCA output.

 

  2. Since we are using a Mac, we needed to set up an audio input so that Audacity could pick up the Tape 2 PC signal. We set a sample rate of 44100 Hz and a 16-bit format, which is the standard for CD audio. For more information, please follow the instructions in the Audacity wiki, Mac and USB input devices.
  3. Once all the settings were made, we did a test recording and made sure the levels were correct (i.e. no clipping, a form of sound distortion). We aimed for a maximum peak of -6 dB.


The green bar should not reach more than -6 dB.

 

  4. Then we started the digitization. We played the cassette in the deck first and clicked the recording button in Audacity immediately after. Since we recorded both sides of the tape, we paused the recording after the first side and resumed after switching to the second side.

Cassette is played for digitization.

Audacity interface on the Mac.
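
The level check can also be expressed numerically. The sketch below is not part of our workflow or of Audacity; it assumes the meter reads in dB relative to full scale (dBFS) and that the recording is signed 16-bit PCM. The function names and the sample values are hypothetical, chosen only for illustration.

```python
import math

# Assumption: signed 16-bit PCM, where full scale is a magnitude of 32768.
FULL_SCALE_16BIT = 32768


def peak_dbfs(samples):
    """Return the peak level of signed 16-bit samples in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # pure silence has no finite level
    return 20 * math.log10(peak / FULL_SCALE_16BIT)


def within_target(samples, target_db=-6.0):
    """True if the loudest sample stays at or below the target peak level."""
    return peak_dbfs(samples) <= target_db


if __name__ == "__main__":
    test_recording = [0, 1200, -8000, 16384, -15000]  # hypothetical sample values
    print(round(peak_dbfs(test_recording), 2), within_target(test_recording))
```

A peak at half of full scale sits just below -6 dB, which is why that target leaves comfortable headroom before clipping.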

 

Exporting a file for access and preservation

Once the tape was digitized, we exported the file in WAV format. WAV with linear (uncompressed) PCM is a preferred and recommended format for long-term preservation. Once we upload it to our content management system, we will digitally preserve it with Archivematica.

For access purposes, we converted the WAV file to MP3 format. MP3 is a compressed audio format that is widely supported and playable on nearly all devices, with a more manageable file size.
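
To give a rough sense of the size difference, here is a minimal sketch that estimates file sizes from the recording settings. The 44100 Hz and 16-bit values come from our setup; the stereo channel count and the 128 kbps MP3 bitrate are assumptions made for illustration, and the estimate ignores file headers and metadata.

```python
def pcm_size_bytes(seconds, sample_rate=44100, bit_depth=16, channels=2):
    """Estimate the audio data size of an uncompressed PCM recording.

    channels=2 (stereo) is an assumption, not a setting stated in the post.
    """
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate * bytes_per_sample * channels)


def mp3_size_bytes(seconds, bitrate_kbps=128):
    """Estimate the size of a constant-bitrate MP3 (128 kbps is an assumed bitrate)."""
    return int(seconds * bitrate_kbps * 1000 / 8)


if __name__ == "__main__":
    one_hour = 60 * 60
    print("WAV:", pcm_size_bytes(one_hour) / 1_000_000, "MB")
    print("MP3:", mp3_size_bytes(one_hour) / 1_000_000, "MB")
```

Under these assumptions the uncompressed file is roughly eleven times larger than the MP3, which is why we keep the WAV for preservation and use the MP3 for access.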

Once metadata is created for the exported file, the audio will be ready to upload.

Please find the recording on UBC Rare Books and Special Collections’ Access to Memory (AtoM) database. The audio will soon be available in Open Collections!

 

See also

Microforms are reduced-size copies of documents used for access and preservation. There are a few different formats of microforms, the most popular being microfilm (film reels) and microfiche (flat film sheets). This post focuses on how we digitize microfilm.

Microfilm reel

 

At the Digitization Centre, we have digitized newspaper microfilms using our flexScan equipment. Although microfilm is a relatively stable format for preservation purposes, digitization increases access to those materials. Thanks to microfilm digitization, the BC Historical Newspapers collection is fully accessible (and searchable) online, without the need for specialized equipment like a microfilm reader.

flexScan equipment and workstation

 

To digitize a roll of microfilm, it must first be installed on the flexScan machine. The film has to be woven through precisely, as shown here:

Then, the digitizer adjusts several settings on a computer connected to the flexScan. These include the width of the film (16 or 35 mm) and polarity (negative or positive). Most of the microfilms we have digitized are 35 mm negatives.

One tricky setting to get right is the “reduction ratio”. The reduction ratio is the ratio of the original newspaper size to the size of the newspaper on the film. So, if the original newspaper was 430 mm high, and the image on the film is 30 mm high, the reduction ratio would be 430 mm / 30 mm ≈ 14.3. This means the original newspaper was shrunk by a factor of about 14.3 on the microfilm.

The reduction ratio is important because it helps us approximate the “true DPI” of the image. DPI stands for “dots per inch”. To calculate the “true DPI” of the microfilm (how many dots per inch on the film itself), we multiply the approximate DPI of the newspaper (300 DPI) by the reduction ratio. Therefore, in this example, the “true DPI” is 300 DPI × 14.3 ≈ 4300 DPI. This number tells the digitizer how to set the height of the scanner’s sensor.
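
The same arithmetic can be written as a short sketch. This is not part of the flexScan software; the function names are hypothetical, and the inputs are the example figures from the text (a 430 mm original, a 30 mm film image, and a 300 DPI target). Without rounding, the ratio comes to about 14.33 and the result to 4300 DPI.

```python
def reduction_ratio(original_height_mm, film_height_mm):
    """Ratio of the original page height to its height on the film."""
    if film_height_mm <= 0:
        raise ValueError("film image height must be positive")
    return original_height_mm / film_height_mm


def true_dpi(target_dpi, ratio):
    """Dots per inch needed on the film to reach target_dpi at original size."""
    return target_dpi * ratio


if __name__ == "__main__":
    ratio = reduction_ratio(430, 30)  # example figures from the text
    print(round(ratio, 2), round(true_dpi(300, ratio)))
```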

After configuring these settings and adjusting the sensor height, it’s time to focus! Pressing a button on the computer interface begins slow, incremental movements of the film reel.

In between each advancement of the reel, the digitizer adjusts the camera lens, monitoring the image on the screen until it looks crisp.

Focusing the image

 

After focusing, there are a couple more settings to be adjusted related to lighting and exposure. Then, it’s time to scan!

Once scanning has started, the digitizer can monitor the images produced as they scroll by, pausing to adjust settings as needed:

Monitoring the scanning process

 

After scanning is complete, the digitizer opens a program called the “Auditor”. This program automatically detects the boundaries of each page; however, it requires some manual adjustment on the part of the digitizer. The screen looks like this:

Adjusting the boundaries of each page

 

In the image above, the blue boxes represent confirmed pages, and the yellow and red boxes show issues that need to be manually adjusted. Once everything has been adjusted, the portions inside the boxes can be output into TIFF files.

Interested in our digitization processes and equipment? Check out these previous blog posts on our other scanning equipment, as well as many more behind-the-scenes posts under the How We Digitize tag:

The UBC Digitization Centre is responsible for the creation of more than 50 collections, all available through the Open Collections website. Our collections are diverse in formats, information and languages.

Having non-English materials, or materials that are not written using the Latin-based alphabet, may be a barrier to access and retrieving information. But technology can be used to help us minimize these barriers.

Laura Ferris and Rebecca Dickson, from the Digitization Centre, have discovered a process to generate searchable transcripts for non-Latin text. The idea originated from an article about a workshop on Optical Character Recognition for Bangla. The result of the workshop was the realization that Google Drive was the most accurate tool for generating transcripts for non-Latin text.

With that information in hand, Ferris and Dickson started to explore Google Drive to create an automated workflow for transcribing batches of items.

Are you interested in trying the workflow out for yourself? If so, check the instructions that Rebecca prepared and give it a try!

  1. Access Google Drive, create a “New folder” and rename it
  2. Create a Google Sheet inside the folder
  3. Open the Sheet, click on “Share”, “Receive shared link” and look for the sheet identifier (the numbers and letters between /d/ and /edit?)
  4. In the Sheet, under “Tools” menu, click “Script editor”
  5. Paste the content from “gs” into the script editor
  6. Update the “folderName” with the name of your folder (defined in step 1)
  7. Update the “sheetId” with the identifier that you found in step 3
  8. Click the “clock” icon and select the options: “extractTextOnOpen”, “From spreadsheet” and “On open”
  9. Save the script editor and close it
  10. Upload JPEGs to the folder (you can check out the sample items prepared for this work)
  11. Open the spreadsheet and wait for Google to do the work!
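
For step 3, the sheet identifier can also be pulled out of the shared link programmatically. The sketch below is only an illustration and is not part of Rebecca’s script; the function name and the example URL are hypothetical, and it assumes the identifier contains only letters, digits, hyphens and underscores.

```python
import re


def extract_sheet_id(url):
    """Return the identifier that follows /d/ in a Google Sheets link, or None."""
    # Assumption: identifiers use only letters, digits, "-" and "_".
    match = re.search(r"/d/([A-Za-z0-9_-]+)", url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical link; a real identifier is much longer.
    link = "https://docs.google.com/spreadsheets/d/1AbC_dEf-123/edit?usp=sharing"
    print(extract_sheet_id(link))
```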

 

If you want to check Laura and Rebecca’s presentation about the topic, check out their slides. If you have questions, feel free to contact us.

 

Sources:

A workshop on Optical Character Recognition for Bangla (British Library)

OCR for non-English language text (Pixelating)

Pixelating-ocr (GitHub)

The Berkeley Poster Collection, housed at the UBC Library’s Rare Books and Special Collections, contains 250 posters created between 1968 and 1973 which document the advocacy and activism of student groups during the Vietnam War era. These posters attest to the tense political climate present in the United States and Southeast Asia during that time and the efforts of underground and guerrilla groups to tap into the social conscience, pressing for greater awareness and public concern regarding the Vietnam War.

“Peace on Nixon”, 1973

“USA Stop Policing the World”, 1973 (with perforated edges of printer paper visible)

“America is a Democracy…”, 1973

At the Digitization Centre we frequently revisit and assess the quality of our digitized collections. As time passes our capacity to produce higher-quality digital images often improves due to newer equipment or scanning techniques. In the case of the Berkeley Poster Collection the images currently available through Open Collections were originally scanned in 2009. It is therefore not surprising that our facilities and equipment have changed so significantly that we’re now revisiting this collection to improve upon the current digital images we have!

A digitization student prepares posters for rescanning

Additionally, a large number of the posters were printed on discarded computer paper which was repurposed for the posters. A significant portion of these pages have computer code and data on the verso of the poster images – information which was not included in the original digital images but which has now been deemed important enough to include in this new round of scans. This type of “ephemera” not only offers insight into the type of work that early computers were doing at Berkeley in the sixties and seventies but also provides contextual information which situates this collection in a very specific time and location.

Verso of poster containing computer code

“America Saves the World”, 1973, ready to be rescanned

Sometimes it can be a challenge to assess all of the possible “values” that a historical item may have, which is why it is so important to revisit and reassess digitized collections over time.

If you would like to browse the Berkeley Poster Collection, click here. To learn more about the equipment that we are using to rescan the posters check out this previous blog post on the topic!

We often write about collections that have already been digitized, but today we want to give you a sneak peek of a forthcoming collection that we’re working on right now.

The BC Historical Documents are a variety of papers, correspondence and text that have been identified as being representative of the documentary history of early British Columbia. These documents highlight the growth and development of BC over time, and feature some key figures in our social and political history. This collection is made up primarily of personal papers, letters, photos and ledger books, as well as a number of educational records such as curriculums and class lists.

Two graduate students from UBC’s School of Library & Archival Studies are working on digitizing these records and adding metadata to them. Through this work, both have had the opportunity to interact with rare and interesting materials, including police reports, yearbooks and personal letters. In one instance, a set of yearbooks from the Provincial Normal School shows the direct impact of World War I, with the 1914/1915 graduating class being half the size of the previous year, and the 1915/1916 yearbook documenting former students who had gone to war, as well as those that had passed away.

A number of letters from the noted politician and 12th premier of BC, Charles Semlin, demonstrate the complex balance between private and public life that political figures often must negotiate. In Semlin’s case, he was known as a conservative politician interested in curbing immigration from Asia and implementing wide-ranging reforms. Despite his divisive political leanings, however, Semlin was a source of financial support for numerous friends and acquaintances throughout his life, a fact well documented in his correspondence.

Across these historical documents, it is possible to gain greater perspective and appreciation for the many components which have contributed to the building of our province, and the variety of stories that make this place unique.

Stay tuned for more information about the Early BC Historical Documents collection!

We’ve got another new (but actually really, really old) addition to our digital collection. We’re excited to share that we have digitized a rare Latin Bible from the 13th century! You can check it out in our Western Manuscripts collection, where many of our oldest books live.

 

The pages are made from vellum, or dried calfskin, as most books were at that time.

This Bible is an amazing addition to our collection for a few reasons. First, it was a Student Bible made in Oxford, England, around 1250 AD, which was pretty remarkable at the time. Back then most Student Bibles were produced on the continent, typically in Paris, for university pupils and professors who used them for their studies. This makes our Bible unique – and the only one like it in a Canadian collection.

This book contains a fair amount of marginalia! Check out all the faded notes on the side.

A second special aspect of this Bible is the concordance at the end of the book. The concordance, pictured below, is an index created for the Bible on where to find certain words or phrases within the book.

Click here to see the concordance for yourself!

One of the early owners created this concordance shortly after the book was finished. The concordance is obviously not part of the original book. We don’t know exactly when it was created or by whom – and if any of you scholars out there want to try to find out, take a shot and let us know about it! We wholeheartedly support you!

Even if you are not a scholar, take a look at the book for yourself, or take a look at the UBC press release on this book. It might make you into a bibliophile!

Digitization of BC Sessional Papers, from 1933-1952, is on its way.

Phase 3 of Sessional Papers has been approved and digitization will start this summer! This phase will look at 41 bound volumes from the British Columbia Sessional Papers. It will increase our current collection by 19 years – and as an added bonus there will be fold out maps and charts to check out.

More maps like this are coming to you soon!

The Sessional Papers are important provincial legislative documents that capture the economic, historical, political, and cultural atmosphere of British Columbia’s history. The Sessional Papers include official committee reports, orders of the day, petitions and papers presented, records of land sales, correspondence, budgetary estimates, proclamations, maps, voters lists by district, and departmental annual reports.

There’s tons of historical content! For a belated celebration of International Women’s Day, the Sessional Papers include women petitioning for the vote in Canada.

Click here to visit our digital collections page to view the volumes we have digitized.

Click here to read more about what sessional papers are and how they can be utilized for research.

Right now, digitized content in Sessional Papers runs from 1878 to 1931.

You can find all sorts of things in Sessional Papers – take a look now and keep your eyes peeled for more coming soon!
