Importing Backissues into OJS: Development of an OJS Import Script with Django: the Session Blog
Presenter: Syd Weidman, Library Systems Supervisor, University of Winnipeg – Session Abstract
Session Overview
Why was this an issue?
With the transition to open-access publishing of several journals at the University of Winnipeg, Syd Weidman and the University library have been involved in multiple aspects of this transformation. Given that these journals have been in print for decades, one of the major obstacles that needed to be addressed was the importation of back issues into an online, open-access compatible format.
Initial attempts using the available software proved difficult. They were met with bugs and their associated patches; overall, Syd described the process as “laborious and convoluted”. He surmised that “in the context of importing [a large volume of] back issues, small efficiencies [may] have a large impact.” With this notion in mind, Syd began work on the Open Journal Systems (OJS) Import Project.
Tackling the problem – Use of Django
Syd highlighted the basic design goals of any software to be used for this purpose; he stressed that the process needed to be as EASY as possible. He sought to optimize the software’s ease of CONSTRUCTION, USE, DEPLOYMENT and MAINTENANCE. Being most familiar and comfortable with the Python programming language, Syd opted to use the Django Web framework to build a Web-based application to carry out the task of importing back issues.
Django is an open-source framework that originated in the online publishing industry. In a short digression, Syd reviewed the “4 freedoms” of open-source software: the freedom to use the software for any purpose, to access its code, and to modify it, along with an understanding that improvements will be shared with others (for more, see the Free Software Foundation’s website). Django, in particular, has several advantages over similar frameworks, namely:
- object-relational mapping – allows the same work to be done in fewer lines of code, increasing robustness
- automatic administrator interface
- elegant URL design
- pluggable template system
- flexible and robust cache system
- i18n compatible – allows for the application to be adapted to other languages without significant engineering changes
- excellent documentation
- an active mailing list (a double-edged sword!)
Success!
With the new import software in place, the U of W was able to scan back issues into PDF format and upload them into their respective online journals. This required entering appropriate metadata for each item to allow for accurate archiving and searching.
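The metadata step described above can be pictured with a small sketch. The session blog does not describe the actual data model or the format OJS expects, so everything here is an assumption for illustration: the `BackIssueArticle` class, its field names, and the XML layout are hypothetical and are not the OJS import schema.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class BackIssueArticle:
    """Hypothetical metadata for one scanned article (field names are assumptions)."""
    title: str
    authors: list
    volume: int
    issue: int
    year: int
    pdf_filename: str

    def missing_fields(self):
        """Return the names of required fields that are empty or malformed."""
        missing = []
        if not self.title.strip():
            missing.append("title")
        if not self.authors:
            missing.append("authors")
        if not self.pdf_filename.lower().endswith(".pdf"):
            missing.append("pdf_filename")
        return missing

    def to_xml(self):
        """Serialize to a simple XML element (illustrative layout, not the OJS schema)."""
        article = ET.Element(
            "article",
            volume=str(self.volume),
            issue=str(self.issue),
            year=str(self.year),
        )
        ET.SubElement(article, "title").text = self.title
        for name in self.authors:
            ET.SubElement(article, "author").text = name
        # Point at the scanned PDF that belongs to this record.
        ET.SubElement(article, "galley", file=self.pdf_filename)
        return article
```

Checking `missing_fields()` before upload is one way to catch incomplete records early, which matters when, as Syd noted, small efficiencies add up over a large volume of back issues.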
Challenges and future directions
One of the difficulties in developing a script for another piece of software is ensuring that the two remain in sync when new versions appear. In an OJS release that followed the development of the import application, incompatibilities and bugs appeared and needed patching.
Commentary/Questions
Just prior to the question period, Syd mentioned the recent development of another application, “Quick Submit”, which may now be able to perform similar functions to his program.
Related Links
- University of Winnipeg library (and their OA publications: Canadian Bulletin of Medical History, Journal of Mennonite Studies and the Canadian Children’s Literature Journal)
- Python programming language
- Django framework
References
Weidman, S. (2009). Importing backissues into OJS: Development of an OJS import script with Django. PKP Scholarly Publishing Conference 2009. Retrieved 2009-07-08, from http://pkp.sfu.ca/ocs/pkp/index.php/pkp2009/pkp2009/paper/view/190
July 11, 2009
PKP Open Archives Harvester for the Veterinarian Academic Community: The Session Blog
Date: July 9, 2009
Presenters: Astrid van Wesenbeeck and Martin van Luijt – Utrecht University
Astrid van Wesenbeeck is Publishing Advisor for Igitur, Utrecht University Library
Martin van Luijt is the Head of Innovation and Development, Utrecht University Library
Abstract
Presentation:
Powerpoint presentation used with permission of Martin van Luijt
Quote: “We always want to work with our clients. The contributions from our users are very important to us.”
Session Overview
The University Library is 425 years old this year. While the library staff are not scientists or students, their mission is to provide services that meet the needs of their clients. Omega-integrated searches bring in metadata from publishers and open access sources and index it.
Features discussed included the institutional repository, digitization and journals [mostly open and digital; about 10,000 digitized archives in total].
Virtual Knowledge Centers [see related link below]
– this is the area of their most recent work
– shifts knowledge sharing from library to centers
– see slides of this presentation for more detail
The Problem They Saw:
We all have open access repositories now. How do you find what you need? There are too many repositories for a researcher to find information.
The Scenario
They chose to address this problem by targeting the needs of a specific group of users. The motivation – a one-stop shop for users and increased visibility for scientists.
The Solution:
Build an open-access subject repository, targeted at veterinarians, containing the content of at least 5 high-profile veterinary institutions and meeting other selected standards.
The work was organized cooperatively, with a project board and a project team made up of knowledge specialists and other essential people. The user interface was shaped by the users.
Their Findings:
Searching alone was not sufficient. As for the repository content, to use the presenter’s word, “Ouch!” Metadata quality varied wildly, relevant material was hard to discern, some content was not accessible, and the repositories held low quantities of material.
Ingredients Needed:
A harvester to fetch content from open archives.
Ingredients Needed 2:
Fetch more content from many more archives through a harvester, filter it and turn it into records and entries, then normalize each archive and pass the results through a filter of 2000+ keywords. This resulted in 700,000+ objects.
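The normalize-then-filter step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Utrecht team's tool: the record layouts, the function names, and the four sample keywords are hypothetical, standing in for the varied source metadata and the 2000+ keyword list mentioned in the session.

```python
import re

# Hypothetical subject keywords; the real filter reportedly used 2000+ terms.
KEYWORDS = {"veterinary", "equine", "bovine", "canine"}


def normalize(record):
    """Map differently shaped source records onto one common layout."""
    title = record.get("title") or record.get("dc:title") or ""
    subjects = record.get("subject") or record.get("dc:subject") or []
    if isinstance(subjects, str):
        # Some archives deliver subjects as one delimited string.
        subjects = re.split(r"[;,]", subjects)
    return {
        "title": title.strip(),
        "subjects": [s.strip().lower() for s in subjects if s.strip()],
    }


def matches_keywords(record, keywords=KEYWORDS):
    """True if any keyword appears as a word in the title or subjects."""
    words = set(re.findall(r"[a-z]+", record["title"].lower()))
    for subject in record["subjects"]:
        words.update(re.findall(r"[a-z]+", subject))
    return not keywords.isdisjoint(words)


def filter_harvest(raw_records):
    """Normalize every harvested record, then keep only the relevant ones."""
    normalized = (normalize(r) for r in raw_records)
    return [r for r in normalized if matches_keywords(r)]
```

Normalizing before filtering means the keyword check only has to understand one record shape, however inconsistent the source archives are.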
Ingredients 3:
Use the harvester, filter its output, develop a search engine and, finally, build a user interface.
Problem: The users wanted a search history and pushed them into dreaming up a way of doing that without a login. As designers, they did not want or need that login, but at first saw no way around a login in order to connect the history to the user. Further discussion revealed that the users did not have a problem with a system where the history did not follow them from computer to computer. A surprise to the designers, but it allowed for a login-free system.
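One way to picture a login-free search history is to key it on an anonymous token held by the browser rather than on a user account. The session did not describe the actual mechanism, so the sketch below is an assumption throughout: the class name, the token-in-a-cookie idea, and the in-memory storage are all hypothetical.

```python
import secrets
from collections import defaultdict, deque


class AnonymousSearchHistory:
    """Keeps recent searches per browser token; no user account is involved."""

    def __init__(self, max_items=20):
        # One bounded history per token; old queries fall off the front.
        self._histories = defaultdict(lambda: deque(maxlen=max_items))

    def new_token(self):
        # Assumption: the token would be stored in a cookie on that browser.
        return secrets.token_urlsafe(16)

    def record(self, token, query):
        self._histories[token].append(query)

    def recent(self, token):
        # Most recent first. An unknown token simply has no history.
        return list(reversed(self._histories.get(token, ())))
```

Because the token lives with the browser, a different computer gets a different token and an empty history, which is the trade-off the users said they could accept.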
Results: Much better research. Connected Repositories: Cornell, DOAJ, Glasgow, Igitur, etc.
Workshop Discussion and Questions:
1. How do you design an intelligent filter for searches? [asked by a gentleman also working to design a similar search engine] Re-harvesting occurs every night, with the PKP harvester rerunning objects through the filter. Incremental harvests are quick. Full harvests take a long time, a couple of weeks, so they try not to do them.
2. Do you use the PKP harvester and normalization tools in PKP? We started, but found that we needed to do more and produced a tool outside the harvester.
3. <Question not heard> It was the goal to find more partners to build the tool and its features. We failed. In the evaluation phase, we will decide if this is the right moment to roll out this tool. From a technical viewpoint, it is too early. We may need 1 to 2 years to fill the repositories. If you are interested in starting your own, we would be delighted to talk to you.
4. I’m interested in developing a journal. Of all your repositories, do you use persistent identifiers? How do I know that years down the road I will still find these things? Is anyone interested in developing image repositories? There is a Netherlands initiative to build a repository with persistent identifiers. What about image repositories? No. There are image platforms.
5. Attendee comment: I’m from the UK. If valuable, we’ll have to fight to protect these systems because of budget cuts and the publishers fighting. So, to keep value, we’ll have to convince government about it.
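The difference between incremental and full harvests mentioned in the first answer can be illustrated with the OAI-PMH protocol, which open archives harvesters generally speak. In OAI-PMH, a `ListRecords` request may carry a `from` date so that only records changed since then are returned. The sketch below only builds the request URL; the helper name and the example base URL are hypothetical, and it is not a description of how the presenters' nightly job is configured.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, last_harvest=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL.

    With last_harvest set, the request is incremental: only records
    created or changed on or after that date are requested. Without it,
    the request asks for everything (a full harvest).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if last_harvest is not None:
        params["from"] = last_harvest.isoformat()  # YYYY-MM-DD
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint, for illustration only.
BASE = "https://repository.example.org/oai"
full_harvest = list_records_url(BASE)
nightly_harvest = list_records_url(BASE, last_harvest=date(2009, 7, 8))
```

A nightly run that passes the previous run's date has far fewer records to fetch and re-filter, which is consistent with incremental harvests being quick while full harvests take weeks.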
Related Links:
OAI6 talk on Virtual Knowledge Centers
NARCIS, a Dutch repository of theses
First Monday article
Posted by Jim Batchelor
July 9, 2009