The UBC Digitization Centre is responsible for the creation of more than 50 collections, all available through the Open Collections website. Our collections are diverse in formats, information and languages.

Non-English materials, or materials that are not written in a Latin-based alphabet, can be a barrier to accessing and retrieving information. But technology can help us minimize these barriers.

Laura Ferris and Rebecca Dickson, from the Digitization Centre, have discovered a process to generate searchable transcripts for non-Latin text. The idea originated from an article about a workshop on Optical Character Recognition for Bangla. The result of the workshop was the realization that Google Drive was the most accurate tool for generating transcripts for non-Latin text.

With that information in hand, Ferris and Dickson started to explore Google Drive to create an automated workflow for transcribing batches of items.

Are you interested in trying the workflow out for yourself? If so, check the instructions that Rebecca prepared and give it a try!

  1. Access Google Drive, create a “New folder” and rename it
  2. Create a Google Sheet inside the folder
  3. Open the Sheet, click on “Share”, “Receive shared link” and look for the sheet identifier (the numbers and letters between /d/ and /edit?)
  4. In the Sheet, under “Tools” menu, click “Script editor”
  5. Paste the content from “gs” into the script editor
  6. Update the “folderName” with the name of your folder (defined in step 1)
  7. Update the “sheetId” with the identifier that you found in step 3
  8. Click the “clock” icon and select the options: “extractTextOnOpen”, “From spreadsheet” and “On open”
  9. Save the script editor and close it
  10. Upload jpegs to the folder (you can check out the sample items prepared for this work)
  11. Open the spreadsheet and wait for Google to do the work!
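
Step 3 asks you to pick the sheet identifier out of the shared link by eye. As a minimal sketch of that one step (not part of Rebecca's script), the Python helper below pulls the identifier from a link. The function name `extract_sheet_id` and the example URL are hypothetical, and the pattern assumes that identifiers may contain hyphens and underscores in addition to letters and numbers.

```python
import re
from typing import Optional

# Hypothetical helper for illustration only; it is not part of the workflow script.
# Assumption: the identifier is the run of letters, digits, hyphens and
# underscores that follows "/d/" in the shared link.
_SHEET_ID_PATTERN = re.compile(r"/d/([A-Za-z0-9_-]+)")


def extract_sheet_id(url: str) -> Optional[str]:
    """Return the identifier found after /d/ in the link, or None if absent."""
    match = _SHEET_ID_PATTERN.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up link in the shape described in step 3.
    example_url = "https://docs.google.com/spreadsheets/d/abc123XYZ_-example/edit?usp=sharing"
    print(extract_sheet_id(example_url))
```

The value it returns is what you would paste into “sheetId” in step 7.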

 

If you want to see Laura and Rebecca’s presentation on the topic, check out their slides. If you have questions, feel free to contact us.

 

Sources:

A workshop on Optical Character Recognition for Bangla (British Library)

OCR for non-English language text (Pixelating)

Pixelating-ocr (GitHub)

The Berkeley Poster Collection, housed at the UBC Library’s Rare Books and Special Collections, contains 250 posters created between 1968 and 1973 which document the advocacy and activism of student groups during the Vietnam War era. These posters attest to the tense political climate in the United States and Southeast Asia during that time and to the efforts of underground and guerrilla groups to tap into the social conscience, pressing for greater awareness and public concern regarding the Vietnam War.

“Peace on Nixon”, 1973

“USA Stop Policing the World”, 1973 (with perforated edges of printer paper visible)

“America is a Democracy…”, 1973

At the Digitization Centre we frequently revisit and assess the quality of our digitized collections. As time passes, our capacity to produce higher-quality digital images often improves thanks to newer equipment or scanning techniques. In the case of the Berkeley Poster Collection, the images currently available through Open Collections were originally scanned in 2009. Our facilities and equipment have changed significantly since then, so we’re now revisiting this collection to improve upon the current digital images!

A digitization student prepares posters for rescanning

Additionally, a large number of the posters were printed on discarded computer paper which was repurposed for the posters. A significant portion of these pages have computer code and data on the verso of the poster images – information which was not included in the original digital images but which has now been deemed important enough to include in this new round of scans. This type of “ephemera” not only offers insight into the type of work that early computers were doing at Berkeley in the sixties and seventies but also provides contextual information which situates this collection in a very specific time and location.

Verso of poster containing computer code

“America Saves the World”, 1973, ready to be rescanned

Sometimes it can be a challenge to assess all of the possible “values” that a historical item may have, which is why it is so important to revisit and reassess digitized collections over time.

If you would like to browse the Berkeley Poster Collection, click here. To learn more about the equipment that we are using to rescan the posters check out this previous blog post on the topic!

We often write about collections that have already been digitized, but today we want to give you a sneak peek of a forthcoming collection that we’re working on right now.

The BC Historical Documents are a variety of papers, correspondence and text that have been identified as being representative of the documentary history of early British Columbia. These documents highlight the growth and development of BC over time, and feature some key figures in our social and political history. This collection is made up primarily of personal papers, letters, photos and ledger books, as well as a number of educational records such as curriculums and class lists.

Two graduate students from UBC’s School of Library & Archival Studies are working on digitizing these records and adding metadata to them. Through this work, both have had the opportunity to interact with rare and interesting materials, including police reports, yearbooks and personal letters. In one instance, a set of yearbooks from the Provincial Normal School shows the direct impact of World War I, with the 1914/1915 graduating class being half the size of the previous year’s, and the 1915/1916 yearbook documenting former students who had gone to war, as well as those who had passed away.

A number of letters from noted politician and 12th premier of BC, Charles Semlin, demonstrate the complex balance between private and public life that political figures often must negotiate. Semlin was known as a conservative politician interested in curbing immigration from Asia and implementing wide-ranging reforms. Despite his divisive political leanings, however, he was a source of financial support for numerous friends and acquaintances throughout his life, a fact well documented in his correspondence.

Across these historical documents, it is possible to gain greater perspective and appreciation for the many components which have contributed to the building of our province, and the variety of stories that make this place unique.

Stay tuned for more information about the Early BC Historical Documents collection!

We’ve got another new (but actually really, really old) addition to our digital collection. We’re excited to share that we have digitized a rare Latin Bible from the 13th century! You can check it out in our Western Manuscripts collection, where many of our oldest books live.

 

The pages are made from vellum, or dried calfskin, as most books were at that time.

This Bible is an amazing addition to our collection for a few reasons. First, it was a Student Bible made in Oxford, England, around 1250 AD, something that was pretty remarkable at the time. Back then most Student Bibles were produced on the continent, typically in Paris, for university pupils and professors who used them for their studies. This makes our Bible unique, and the only one like it in a Canadian collection.

This book contains a fair amount of marginalia! Check out all the faded notes on the side.

A second special aspect of this Bible is the concordance at the end of the book. The concordance, pictured below, is an index created for the Bible on where to find certain words or phrases within the book.

Click here to see the concordance for yourself!

One of the early owners created this concordance shortly after the book was finished, so it is not part of the original book. We don’t know exactly when it was created or by whom. If any of you scholars out there want to try to find out, take a shot and let us know about it! We wholeheartedly support you!

Even if you are not a scholar, take a look at the book for yourself, or take a look at the UBC press release on this book. It might make you into a bibliophile!

Digitization of BC Sessional Papers, from 1933 to 1952, is on its way.

Phase 3 of Sessional Papers has been approved and digitization will start this summer! This phase will look at 41 bound volumes from the British Columbia Sessional Papers. It will increase our current collection by 19 years, and as an added bonus there will be fold-out maps and charts to check out.

More maps like this are coming to you soon!

The Sessional Papers are important provincial legislative documents that capture the economic, historical, political, and cultural atmosphere of British Columbia. They include official committee reports, orders of the day, petitions and papers presented, records of land sales, correspondence, budgetary estimates, proclamations, maps, voters lists by district, and departmental annual reports.

There’s tons of historical content! For a belated celebration of International Women’s Day, the Sessional Papers include women petitioning for the vote in Canada.

Click here to visit our digital collections page to view the volumes we have digitized.

Click here to read more about what sessional papers are and how they can be utilized for research.

Right now digitized content in Sessional Papers runs from 1878 to 1931

You can find all sorts of things in Sessional Papers – take a look now and keep your eyes peeled for more coming soon!

Forget watching Star Wars, Avengers, and Lord of the Rings on your cellphone. If you are looking for a larger-than-life story delivered to you in a small container, check out our newly digitized epic poem Orlando Furioso in Western Manuscripts. The full size of the book is only 11 by 5 cm.

This preciously small package packs a punch though! Orlando Furioso is an Italian epic poem written in 1516. With 46 cantos (or chapters) this is one of the longest poems in literature. Our version, one of the earliest, was published in 1577.

“Orlando”, when translated into French, became “Roland”, so a more apt translation of the title into English is “Raging Roland”

The poem follows Orlando, a singular knight involved in the war between Charlemagne’s Christians and the Saracen army that attempted to take over Europe. The setting ranges over the whole world, with a trip to Hell and the moon thrown in! As befits any epic, there are also soldiers, sorcerers, gigantic sea monsters, and even a hippogriff.

This is the Canto where the main characters go from Hell to the moon. Hard to tell which one it is from this picture!

The poem focuses on romantic chivalry, especially Orlando’s love for a princess, which among other things drives him into a mad killing frenzy. Romantic enough for Valentine’s Day?

A female knight is also one of the main characters of the poem. Here she is taking down a foe!

For us, the tiny, tightly bound book was a challenge to digitize. Not only was it old, small, and fragile, but the print often runs very close to the center binding, making it difficult to capture a complete image of each page.

Can you spot the sea monster in this canto?

However, here at the Digitization Centre we are nothing if not dogged in our pursuit of world digitization. To bring this epic poem to you in a digital format we used our ATIZ machine, shifting the book cradle from side to side as we digitized. It may have taken a few tries and a long while but, and this is a direct quote from our main digitizer, Leslie Fields, “all in all it was really worth it.”

So check it out for yourself to see what all the fuss is about!
