The Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Social Sciences and Humanities Research Council of Canada (SSHRC), Canada’s three federal research funding agencies (the agencies), have developed a draft Tri-Agency Research Data Management Policy.

 

The draft policy aims to support Canadian research excellence by fostering sound digital data management and data stewardship practices with suggested requirements related to three primary areas:

 

  1. Institutional data management strategies
  2. Researcher data management plans
  3. Data deposit

 

Based on feedback received from institutions, associations, organizations and individuals on the draft policy and its usefulness in advancing data management practices in Canada, the three agencies plan to launch the Tri-Agency Research Data Management Policy in 2019.

 

The feedback period is open until August 31, 2018.

 

Read the draft Tri-Agency Research Data Management Policy and FAQs

 

Explore the guide for UBC researchers, The Tri-Agency Open Access Policy: How the UBC Library Can Help


The UBC Digitization Centre is responsible for the creation of more than 50 collections, all available through the Open Collections website. Our collections are diverse in formats, information and languages.

Non-English materials, and materials written in scripts other than the Latin alphabet, can be a barrier to accessing and retrieving information. Technology can help us minimize these barriers.

Laura Ferris and Rebecca Dickson, from the Digitization Centre, have developed a process for generating searchable transcripts of non-Latin text. The idea originated from an article about a workshop on Optical Character Recognition for Bangla. The workshop concluded that Google Drive was the most accurate tool for generating transcripts of non-Latin text.

With that information in hand, Ferris and Dickson started to explore Google Drive to create an automated workflow for transcribing batches of items.

Are you interested in trying the workflow out for yourself? If so, check the instructions that Rebecca prepared and give it a try!

  1. Access Google Drive, create a “New folder” and rename it
  2. Create a Google Sheet inside the folder
  3. Open the Sheet, click “Share”, then “Get shareable link”, and look for the sheet identifier (the string of numbers and letters between /d/ and /edit? in the link)
  4. In the Sheet, under “Tools” menu, click “Script editor”
  5. Paste the content from “gs” into the script editor
  6. Update the “folderName” with the name of your folder (defined in step 1)
  7. Update the “sheetId” with the identifier that you found in step 3
  8. Click the “clock” icon and select the options: “extractTextOnOpen”, “From spreadsheet” and “On open”
  9. Save the script and close the script editor
  10. Upload JPEG files to the folder (you can check out the sample items prepared for this work)
  11. Open the spreadsheet and wait for Google to do the work!
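
Step 3 is the easiest one to get wrong by hand, so here is a minimal Python sketch of pulling the sheet identifier out of a shared link. It is not part of the workflow that Rebecca prepared: the function name `extract_sheet_id` and the example link are hypothetical, and the sketch assumes the identifier may contain hyphens and underscores as well as numbers and letters.

```python
import re
from typing import Optional

# Hypothetical helper for illustration; not part of the Digitization Centre's script.
# Assumption: the identifier is the run of letters, digits, hyphens and
# underscores that follows "/d/" in the shared link.
_SHEET_ID_PATTERN = re.compile(r"/d/([A-Za-z0-9_-]+)")


def extract_sheet_id(url: str) -> Optional[str]:
    """Return the identifier that follows /d/ in a shared link, or None if absent."""
    match = _SHEET_ID_PATTERN.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up link for demonstration; replace it with your own shared link.
    example = "https://docs.google.com/spreadsheets/d/1AbC-dEf_123/edit?usp=sharing"
    print(extract_sheet_id(example))
```

The value this prints is what you would paste into “sheetId” in step 7.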

 

To see Laura and Rebecca’s presentation on the topic, check out their slides. If you have questions, feel free to contact us.

 

Sources:

A workshop on Optical Character Recognition for Bangla (British Library)

OCR for non-English language text (Pixelating)

Pixelating-ocr (GitHub)
