A tag-based indexer for items in D(elicious)Space?


cloudythinking, originally uploaded by zanzo.

I mentioned in my previous post that Julià Minguillón and I have been throwing around Mike Caulfield’s suggestion that we think about how we might exploit a third party system with a good API such as Delicious.

For a while, I was just thinking it might be used as a social adjunct to the main search indexing, something like the Social Search view on the MACE Portal. But Julià seems prepared to push the idea much further. The following text is his:

We would like to use DSPACE just for storing content, not for browsing or searching. Each piece of content receives a permanent URL that will be used for retrieving it. Let’s suppose we set up a Delicious account which is going to be primarily managed by the “repository gardener.” Every time we add a new resource to the repository in DSPACE, we bookmark the URL provided by DSPACE using Delicious and add all the tags there; we do not add tags as metadata fields in DSPACE.

The tags added to the Delicious entry for every piece of content would ideally be a mixture of basic tags, keywords, etc. but it would be interesting to add some tags like “dc.title=”, etc., following the Dublin Core metadata scheme (or in fact, any other scheme such as LOM, or even a mixture of both).

It would be great to use the Delicious API to simplify the process of adding such tags, especially those related to metadata fields. Then, once the gardener determines that the tags for one item are complete, he or she could tag it as “validated” or whatever in order to allow another process to retrieve such validated resources from Delicious and automatically import metadata into DSPACE using its API.

In fact, such objects could be automatically tracked using RSS so future changes or updates can be imported into DSPACE easily (mainly new metadata) as long as the gardener “approves” such changes. The basic idea is to take advantage of all social and web 2.0 tools for building tag clouds, searching (e.g. Director), etc., allowing users to “use” DSPACE within their blogs, learning spaces, etc. without using DSPACE.
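To make the tagging convention a little more concrete, here is a minimal sketch of how the tags on a bookmark might be split into Dublin Core fields and free keywords, and how only “validated” items would be picked up for import. It does not call the Delicious or DSpace APIs: bookmarks are plain dictionaries, the repository URLs are made up, and the function names are hypothetical. It also assumes a tag cannot contain spaces, hence the underscores in the title value.

```python
VALIDATED_TAG = "validated"   # the gardener's "complete" marker from the text
SCHEME_PREFIXES = ("dc.",)    # assumption: only Dublin Core-style tags are structured


def split_tags(tags):
    """Separate structured 'dc.field=value' tags from free keywords."""
    metadata = {}
    keywords = []
    for tag in tags:
        if tag == VALIDATED_TAG:
            continue  # workflow marker, not descriptive metadata
        field, sep, value = tag.partition("=")
        if sep and field.startswith(SCHEME_PREFIXES):
            if value:
                metadata.setdefault(field, []).append(value)
            continue  # a bare "dc.title=" is dropped as incomplete
        keywords.append(tag)
    return metadata, keywords


def validated_records(bookmarks):
    """Yield one import record per bookmark the gardener has validated."""
    for bookmark in bookmarks:
        if VALIDATED_TAG not in bookmark["tags"]:
            continue
        metadata, keywords = split_tags(bookmark["tags"])
        yield {"url": bookmark["url"], "metadata": metadata, "keywords": keywords}


# Hypothetical sample data standing in for what a bookmark export might contain.
bookmarks = [
    {
        "url": "http://repository.example.org/handle/123/1",
        "tags": ["statistics", "dc.title=Intro_to_distributions",
                 "dc.language=en", "validated"],
    },
    {
        "url": "http://repository.example.org/handle/123/2",
        "tags": ["regression", "dc.title=Draft"],
    },
]

for record in validated_records(bookmarks):
    print(record["url"], record["metadata"], record["keywords"])
```

Only the first bookmark is yielded, because the second has not been tagged “validated” yet. How the resulting records would actually be pushed into DSpace is left open here.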

There are a number of reasons this approach appeals to me:

* It builds on a fairly intuitive and user-friendly system, one with a sizable user base, with a well-established API. There are a ton of third-party mash-ups already built, and I would expect that further custom development and interfaces would be relatively easy. Delicious also has fantastic RSS support for accounts, tags, and (as I just discovered) individual URLs on the web that have been bookmarked.

* It builds on a system that is already widely used. People could interact with it from their own accounts through features such as the network, subscriptions, and inbox, or simply suggest items by using a tag.

* It can incorporate resources from outside the repository into the browser. It also brings the resources from the repository into the wider web ecosystem. Students would be able to access the resources after they’ve graduated from the course.
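Since RSS keeps coming up, here is a rough sketch of the “track changes and let the gardener approve them” step: read the tags for each bookmarked URL out of a feed, compare them with what was last imported, and list the differences for review. The feed below is a made-up, generic RSS 2.0 sample in which each tag is a `<category>` element; whether a real Delicious feed has exactly this shape is an assumption, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample; the real feed layout is an assumption.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>repository bookmarks (sample)</title>
    <item>
      <title>Intro to distributions</title>
      <link>http://repository.example.org/handle/123/1</link>
      <category>statistics</category>
      <category>dc.language=en</category>
    </item>
  </channel>
</rss>"""


def tags_by_url(feed_xml):
    """Map each item's link to the set of tags found in its <category> elements."""
    root = ET.fromstring(feed_xml)
    result = {}
    for item in root.iter("item"):
        link = item.findtext("link")
        tags = {c.text.strip() for c in item.findall("category") if c.text}
        result[link] = tags
    return result


def pending_changes(imported, current):
    """List tags added or removed since the last import, per URL, for approval."""
    changes = {}
    for url, tags in current.items():
        before = imported.get(url, set())
        added, removed = tags - before, before - tags
        if added or removed:
            changes[url] = {"added": sorted(added), "removed": sorted(removed)}
    return changes


# Tags as they stood at the last import (hypothetical stored state).
last_import = {"http://repository.example.org/handle/123/1": {"statistics", "draft"}}

for url, change in pending_changes(last_import, tags_by_url(SAMPLE_FEED)).items():
    print(url, change)
```

Nothing is imported automatically in this sketch; it only produces the list the gardener would look over before approving an update.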

I do have a few more questions that linger…

* Are there risks I am not accounting for? I suppose there is the standard risk of using a third-party system, but the workflow model Julià proposes would allow for the importing of keywords into the DSpace repository. And it is fairly easy to get data out of Delicious. But maybe an open source Delicious clone such as Scuttle would be better?

* I have not done an exhaustive check of alternatives to Delicious that might offer features that could be useful.

* Does anyone have favorite sites or clients for tag management and visualizations? We would be especially interested in how to connect concepts. I can’t for the life of me figure out how to work this thing. But it looks very close to what we need.

Oh yes, I can’t seem to load my bookmarks and tags into the Director Interface either. I go through the steps, and it says it will not recognise my account, that it is “Unable to load [my] bookmarks!”

Any other thoughts or suggestions are welcome!

About Brian

I am a Strategist and Discoordinator with UBC's Centre for Teaching, Learning and Technology. My main blogging space is Abject Learning, and I sporadically update a short bio with publications and presentations over there as well...

10 Responses to A tag-based indexer for items in D(elicious)Space?

  1. Dave says:

    So…Julià’s suggestion is:
    1) put new content in DSpace
    2) bookmark new content with Delicious, and tag with metadata
    3) automatically import metadata into DSPACE using its API

    Wouldn’t it make more sense to just add the metadata in DSpace to begin with? This project’s goal seems to be “I like using Delicious, let’s find a way to force people to go through Delicious when they want to use DSpace.”

    A better goal might be “Let’s explore whether it makes sense to take some of the features I like about Delicious and add them to DSpace.” or “Let’s explore ways that we can help all kinds of existing tools connect with DSpace.” No sense in getting locked in with a 3rd-party tool.

  2. Scott Leslie says:

    Yeah, this seems kind of counterintuitive to me. I mean if you are uploading the content itself to DSpace, you are most of the way there, why not add the metadata there too? It seems like the issue is that DSpace’s search and browse interface (along with its contribution workflow) is being found deficient. What, exactly, is it that they *like* about DSpace? Its “bitstreams” approach to preservation, and that’s it? In that case, if it’s not too late, might I recommend Fedora, which only provides the archiving pieces as a service for which you build your own front end. This way you could put Drupal or whatever you liked in front of it and still have the benefits of an archival repository on the back end.

    If DSpace is not negotiable, I’d suggest looking at the documentation on customizing its contribution workflow (http://www.dspace.org/1_5_1Documentation/ch11.html) to determine if you couldn’t create a flow that was more lightweight, and included tags. The idea of someone using a tag like “dc.title:” is mind boggling, though – if there’s anything worse than structured metadata, it’s trying to get folksonomic tools to support structured metadata.

    (Sorry if this sounds harsh, I am grumpy this morning. I like the idea of wrapping archival tools with social interfaces, but the above seems not that clearly conceptualized).

  3. Julià Minguillón says:

    well, I understand your point and you’re completely right, I was not very clear, I really appreciate your comments.

    Of course we can provide DSPACE with all the metadata available at the moment of uploading any piece of content, and then such metadata can hopefully be exported to delicious by using RSS (does delicious allow you to automatically import “content” from RSS files?). The idea was to minimize the workflow while using DSPACE. And I don’t expect learners to use “dc.title:whatever”, of course; my first idea was to provide them with official metadata and allow them to add new tags, in order to tag content according to their needs. My idea was to use these special tags (dc.whatever) for building tag clouds, dynamic treemaps, visual taxonomies, etc. using the currently available tools. I think I was just trying to create a simple, nice new interface for DSPACE using delicious, trying to combine official tags with tags used by users to describe the contents they find useful.

    On the other hand, it’s not too late to consider Fedora, although DSPACE is already used at UOC, and the people responsible for technological issues will not be willing to maintain two different approaches to running a repository, I guess. And we would like to have a clean DSPACE installation, so we can upgrade it without having to review all the code we might need to add if we modified it to provide extra functionality already present in delicious, for example.

    In fact most of our learners do not use delicious at all, so we will probably need to provide them with a simple tool integrated in the virtual campus (our LMS) for adding metadata. I think it will be easier for us to integrate delicious within our LMS than DSPACE.

    I see this idea needs more deep thinking…

    Thank you all!

  4. I may be drawing you further off path here, but what I was thinking of more specifically is that with delicious you could do something like a greasemonkey script or firefox extension.

    I’m woefully unfamiliar with DSpace’s interface, but I’m assuming there is a unique URL associated with these resources — if that URL can be read by a greasemonkey script or extension I would assume that it’s an easy matter to insert a form on that page dynamically that allows for annotation (I’ve seen this done, for instance, with the Remember The Milk GMail Firefox extension).

    As far as what you get, once again, since I’m not familiar with DSpace, I can’t say what the benefits would be relative to a native DSpace solution — but I specced out a citizen science reporting system similar to this a while back, and the conversation between me and two other developers kept circling back to this question of why the loose coupling was important (because the person we were advising kept looking at us a little crazy).

    And the answer we had to keep coming back to is that yes, it’s easier and more straightforward to extend systems where you have access to the code — but if it is a solution where you need access to the code to solve it, well — you’ve just locked everyone else out from co-developing, and solving things by their own ingenuity, and organic systems. The strength of a system in delicious is it really encourages people to fuck with it in a way (possibly) that a system in DSpace might not. And it allows for unanticipated benefits — when someone develops a facebook / delicious integration app — voila, you suddenly have a dspace facebook community, without lifting a finger.

    You have to ask yourself — will my architecture evolve without me? And the Web 2.0 lesson is if the answer to that is no, you’re building it wrong.

    I’d really have to see the interface as it stands to see if what I’m saying here applies well though. And I should do that anyway, for professional reasons. Let me know if you have time for a tour.

  5. Tannis says:

    Maybe I’m a little enamoured with VUE right now, but couldn’t you do this:

    1. dedicate a server for content
    2. use VUE as an interface to that content, and as a way of organizing and identifying it (you can also link to external URLs, meeting your concern of bringing in outside resources). Also, I have no idea what Fedora is, but apparently it supports it (http://vue.tufts.edu/features/index.cfm).
    3. By using VUE as an interface, you can tag, use ontologies, and other good stuff that sounds great but I have yet to explore. Plus, the idea of contextualising resources, in addition to tagging, seems to me to be a useful feature, since VUE maps could be created for different Faculties or disciplines, but still be part of a larger whole.

    Maybe the thing missing here is the RSS?

  6. Scott says:

    Maybe a central repo that is auto updated.
    technorati style blogs, wikis, LCMS who want to be hosted ping via RPC the central content repo. The repo reads in the RSS feed (like technorati) stores the content.

    Useful content can be ranked via how many use remix etc(like technorati) crud floats to the bottom. There could be a 100 list for each category.

    RSS feeds include the taxonomy. will self organize. Or could integrate bookmarking.

    For bookmarking this looks promising opensource port of Magnolia:

    http://ma.gnolia.org/
    or
    Connotea very popular in the Life Sciences might also be useful:
    http://www.connotea.org/
    download:
    http://sourceforge.net/projects/connotea

    If people have to create the content in two places it will never fly; it has to be automated. I am working on a central repo using Eprints for a research centre. Although not as good as Dspace it is one more step getting stuff into it.

  7. Brian says:

    “Dave” – I don’t think anyone is trying to force anyone to use Delicious that would prefer to use DSpace. There has been some consideration of adding the features needed to DSpace directly, but anything significant would require hacking the code, and verging off the main DSpace build, which could conflict with future development with the core system. So whatever is done needs to be through the API. As for “Let’s explore ways that we can help all kinds of existing tools connect with DSpace.” – I thought that was what we were doing. And I don’t see how we would be locked into Delicious…

    Scott – I think Julià addresses your points in terms of the UOC’s perspective re:DSpace, and I can’t think of much to add. I’ll need to mull your point re structured/folksonomic metadata – you may be right. Though one nice thing about our present experiment is we can hack together a working body of resources and metadata very quickly.

    Mike – your points are very interesting. Julià and I were discussing the problem of access to code systems this morning. Maybe we should be exploring a Greasemonkey script for some of the workflow – though I don’t think we can require Firefox, etc… for end users. As for the current interface, right now the UOC interface for DSpace is the standard one you see with most installations.

    Tannis – we actually have played around with VUE, Julià used it to create this concept map. But correct me if I’m wrong, but isn’t VUE a client side application? I don’t see it as a web-based interface. Maybe I’m missing something. I believe the Fedora support referred to is the ability to access and import resources in the VUE interface.

    We can never have too much RSS.

    Oh, BTW, Tannis, I showed Julià your self-guided learning statistics by OER blog, he was impressed!

  8. Julià Minguillón says:

    Dear Tannis,

    if Brian has nothing against it, I would like you to explain how you select the resources you try to learn, concretely, do you have a goal in mind (i.e. comparing two distributions) and then try to select the best resources or maybe you have a syllabus and you follow it?

    anyway, I’m really interested because one of the things we are discussing with Brian is combining different visualization techniques for browsing the repository of statistical learning resources, in order to help learners to contextualize the contents they are using in a specific learning activity, that is, taxonomies in form of dynamic trees, tag clouds for keywords and lists of desired learning goals or competences

    thank you in advance!

  9. Paul Joseph says:

    Well, I wanted to write a rambling piece on the merits of structured metadata (one of the hallmarks of DSpace as a useful repository application), but that would be a digression. Rather, I’ll throw my two bits in on the subject of Delicious and propose a 2.0 alternative. I do want to reiterate that I think the preferred method for what you’re looking for is to build a social network using Drupal sitting on Fedora Commons – that would provide the most flexibility, but would require the most work.

    While Delicious is handy, it does have limitations: (1) it cannot ingest content in batches; (2) it’s fickle with tag management (it wouldn’t be difficult for me to spell actuarial as actuerial), which leads to either serious tag management, the construction of very complex RSS feeds (include tags: actuarial, actuerial, actuareil…), or omitted content by virtue of a spelling mistake; (3) a user cannot tag other users’ content; (4) the notes field is restricted to 1000 characters. Brian has already outlined the positives, so I’ll skip that bit.

    A more flexible 2.0 tool that would require minimal effort to get up and running – at least as a prototype interface on top of DSpace – is… flickr. Ok, it sounds a bit odd, I know. But we’re just throwing around ideas, right? Each object in the repository can be represented by some kind of image in flickr. The Description field can include all that lovely structured metadata: Title, Author, Subject… and include the persistent URL to the object. The tags can reiterate the metadata and others can contribute tags not included in the original metadata. Users can comment on the objects. You’ve got a robust API to do some fancy things with the flickr content using tag clouds and link rolls. You can batch upload images (that is, images as representation of the DSpace objects). Where can we see this in action? There’s a few large repositories that feed their content into flickr already: the McCord Museum, the Library of Congress, and the Powerhouse Museum. Each of these flickr users permit users to tag, comment, and place notes on their content. Each one links back to the object in its repository. It might not be fancy, but it works. And it opens the repository to a very large community. Obviously there’s going to be issues about my proposal – I still have the same concerns about the consistency of tagging, though at least flickr allows for full text searching – but I’ve said enough. How’s that for a crazy librarian’s ideas? Most of what I have to say is generally less outlandish. But I thought I’d take off the glasses, loosen the cardigan, and have some fun.

    pj

  10. Imma,

    You may find useful the “Social Tag Importer” add-on we released last year.

    Here’s the link: http://sourceforge.net/tracker/index.php?func=detail&aid=1835387&group_id=19984&atid=319984

    It works just for Connotea, but I have a student that will start working on it soon for a new release. If you have any requirements, let us know. Maybe we can implement some of them.

    Cheers,

    Ana

Comments are closed.