Live-blogging the 2009 Vancouver PKP Conference

PKP Open Archives Harvester for the Veterinarian Academic Community: The Session Blog

Date: July 9, 2009

Presenters: Astrid van Wesenbeeck and Martin van Luijt – Utrecht University

Photo taken at PKP 2009, with permission

Astrid van Wesenbeeck is Publishing Advisor for Igitur, Utrecht University Library
Martin van Luijt is the Head of Innovation and Development, Utrecht University Library



Powerpoint presentation used with permission of Martin van Luijt

Quote: “We always want to work with our clients. The contributions from our users are very important to us.”

Session Overview

The University Library is 425 years old this year. While the library staff are not scientists or students, their mission is to provide services that meet the needs of their clients. Omega, the integrated search, brings in all metadata from publishers and open access sources and indexes it.

Features discussed included the institutional repository, digitization and journals [mostly open and digital, total about 10 000 digitized archives].

Virtual Knowledge Centers [see related link below]

– this is the area of their most recent work
– shifts knowledge sharing from library to centers
– see slides of this presentation for more detail

The Problem They Saw:

We all have open access repositories now. How do you find what you need? There are too many repositories for a researcher to find information.

The Scenario

They chose to address this problem by targeting the needs of a specific group of users. The motivation – a one-stop shop for users and increased visibility for scientists.

The Solution:

Build an open-access subject repository, targeted at veterinarians, containing the content of at least 5 high-profile veterinary institutions and meeting other selected standards.

The project was organized cooperatively, with a project board and a project team consisting of knowledge specialists and other essential people. The user interface was shaped by the users.

Their Findings:

Searching was not sufficient. The repository content was, to use his word, “Ouch!” Metadata quality varied wildly, relevant material was hard to discern, some content was not accessible, and the repositories held little content.

Ingredients Needed:

A harvester to fetch content from open archives.
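To make the harvesting step concrete, here is a minimal sketch of reading harvested records. It assumes the archives expose OAI-PMH with Dublin Core (oai_dc) metadata, which is what the PKP Open Archives Harvester is designed to harvest. The sample response, the identifier and the function name `parse_records` are made up for illustration and are not from the presentation.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH and Dublin Core XML namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written ListRecords response (not real repository data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2009-07-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Equine colic: a case study</dc:title>
          <dc:subject>veterinary medicine</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per <record> with its identifier, datestamp, titles and subjects."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", default="", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", default="", namespaces=NS),
            "titles": [e.text or "" for e in rec.iterfind(".//dc:title", NS)],
            "subjects": [e.text or "" for e in rec.iterfind(".//dc:subject", NS)],
        })
    return records


records = parse_records(SAMPLE_RESPONSE)
```

A real harvester would fetch such responses over HTTP and follow resumption tokens; this sketch only covers the parsing step.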

Ingredients Needed 2:

Fetch more content from many more archives, filter it and put it into records and entries through a harvester, then normalize each archive, and put it through a 2000+ keyword filter. This resulted in 700,000+ objects.
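The normalize-then-filter idea can be sketched roughly as follows. The presenters did not show their filter, so everything here is an assumption: the four keywords stand in for the 2000+ keyword list, and the record layout and function names are hypothetical.

```python
import re

# Hypothetical stand-in for the 2000+ keyword list mentioned in the talk.
VET_KEYWORDS = {"veterinary", "equine", "bovine", "canine"}


def normalize(text):
    """Collapse whitespace and lowercase so records from different archives compare alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def matches_filter(record, keywords=VET_KEYWORDS):
    """Return True if any keyword appears as a whole word in the record's titles or subjects."""
    haystack = normalize(" ".join(record.get("titles", []) + record.get("subjects", [])))
    words = set(re.findall(r"[a-z]+", haystack))
    return not keywords.isdisjoint(words)


harvested = [
    {"titles": ["Equine   Colic: a case study"], "subjects": []},
    {"titles": ["Medieval poetry"], "subjects": ["literature"]},
]
kept = [r for r in harvested if matches_filter(r)]
```

Whole-word matching is a deliberate simplification; a production filter would likely also need multi-word terms, stemming and several languages.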

Ingredients 3:

Use the harvester and the filter, then develop a search engine and, finally, a user interface.

Problem: The users wanted a search history and pushed them into dreaming up a way of doing that without a login. As designers, they did not want or need that login, but at first saw no way around a login in order to connect the history to the user. Further discussion revealed that the users did not have a problem with a system where the history did not follow them from computer to computer. A surprise to the designers, but it allowed for a login-free system.
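The talk did not say how the login-free history was implemented. One plausible approach, shown purely as an assumption, is to hand each browser a random token (for example in a cookie) and key the history on that token, so the history stays with the computer rather than the person. The class and method names below are hypothetical.

```python
import secrets
from collections import defaultdict, deque


class SearchHistoryStore:
    """Keeps recent queries per anonymous browser token; no user accounts involved."""

    def __init__(self, max_entries=20):
        # Each token maps to a bounded queue, so old queries drop off automatically.
        self._histories = defaultdict(lambda: deque(maxlen=max_entries))

    def new_token(self):
        """Create a random token that a site could store in the browser, e.g. as a cookie."""
        return secrets.token_urlsafe(16)

    def record(self, token, query):
        self._histories[token].append(query)

    def history(self, token):
        return list(self._histories[token])


store = SearchHistoryStore(max_entries=2)
office_pc = store.new_token()
home_pc = store.new_token()
store.record(office_pc, "equine colic")
store.record(office_pc, "bovine mastitis")
store.record(office_pc, "canine hip dysplasia")
```

Because the token lives in one browser, a search made at the office does not show up at home, which matches the trade-off the users said they could accept.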

Results: Much better research. Connected Repositories: Cornell, DOAJ, Glasgow, Igitur, etc.

Workshop Discussion and Questions:

1. How do you design an intelligent filter for searches? [asked by a gentleman who is also designing a similar search engine] Re-harvesting occurs every night, with the PKP harvester rerunning objects through the filter. Incremental harvests are quick. Full harvests take a long time, a couple of weeks, so they try not to do them.

2. Do you use the PKP harvester and normalization tools in PKP? We started, but found that we needed to do more and produced a tool outside the harvester.

3. <Question not heard> It was the goal to find more partners to build the tool and its features. We failed. In the evaluation phase, we will decide if this is the right moment to roll out this tool. From a technical viewpoint, it is too early. We may need 1 to 2 years to fill the repositories. If you are interested in starting your own, we would be delighted to talk to you.

4. I’m interested in developing a journal. Across all your repositories, do you use persistent identifiers? How do I know that years down the road I will still find these things? Is anyone interested in developing image repositories? There is a Netherlands initiative to build a repository with persistent identifiers. As for image repositories: No. There are image platforms.

5. Attendee comment: I’m from the UK. If these systems are valuable, we’ll have to fight to protect them against budget cuts and opposition from publishers. So, to keep that value, we’ll have to convince government of it.
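On the first question above, the difference between incremental and full harvests can be illustrated with the OAI-PMH `ListRecords` request, whose optional `from` argument limits the response to records changed since a given date. This is a sketch of the protocol request only, not of the presenters' setup; the base URL and the function name are hypothetical.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, last_harvest=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL; with last_harvest set, only newer records are requested."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if last_harvest is not None:
        # 'from' is the standard OAI-PMH argument for selective (incremental) harvesting.
        params["from"] = last_harvest.isoformat()
    return f"{base_url}?{urlencode(params)}"


BASE = "https://repository.example.org/oai"  # hypothetical endpoint
full_url = list_records_url(BASE)
incremental_url = list_records_url(BASE, last_harvest=date(2009, 7, 8))
```

The nightly run would use the incremental form with the date of the previous harvest, while the full form asks for everything and is the slow case described in the answer.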

Related Links:

OAI6 talk on Virtual Knowledge Centers

University Library at Utrecht

Online Journal

Open access interview

NARCIS, a Dutch repository of theses

First Monday article

Posted by Jim Batchelor