This is our final blog post for the JISC RDTF (now Discovery) SALDA project, written on completion of the six-month project. I’m sure there will be more related blog posts here in the coming months.
Things we have produced
The SALDA Project has produced the following:
The catalogue data of the Mass Observation Archive is now available on the Talis Platform licensed under ODC-PDDL.
Simple text search http://api.talis.com/stores/massobservation/items
Sparql interface at: http://api.talis.com/stores/massobservation/services/sparql
The SALDA XSLT stylesheet is here, licensed under a modified BSD licence.
Chris Keene has created pages for open data at the University of Sussex Library:
The direct link to the SALDA produced data from the Mass Observation Archive is here:
http://data.lib.sussex.ac.uk/data/mass-observation/
Some human readable examples of the data:
http://data.lib.sussex.ac.uk/archive/doc/person/nra/harrissonthomas1911-1976anthropologist
http://data.lib.sussex.ac.uk/archive/id/archivalresource/gb181SxMOA1
The data references terms from (amongst others) the following RDF vocabularies (thanks to Pete Johnston at Eduserv):
http://purl.org/dc/terms/
http://xmlns.com/foaf/0.1/
http://www.w3.org/2004/02/skos/core#
http://www.openarchives.org/ore/terms/
http://linkedevents.org/ontology/
http://data.archiveshub.ac.uk/def/
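As a rough illustration of how the SPARQL interface and these vocabularies fit together, here is a minimal Python sketch that builds a query URL for the endpoint listed above. The prefix labels (dcterms, foaf, skos, ore, lode, hub) are conventional choices of mine rather than anything fixed by the data, and the example query assumes that resources carry a dcterms:title, which I have not checked against the store. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# SPARQL endpoint as given earlier in this post.
ENDPOINT = "http://api.talis.com/stores/massobservation/services/sparql"

# Namespace URIs from the list above. The prefix labels are assumptions
# chosen for readability; any labels would work.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ore": "http://www.openarchives.org/ore/terms/",
    "lode": "http://linkedevents.org/ontology/",
    "hub": "http://data.archiveshub.ac.uk/def/",
}


def build_query(body):
    """Put one PREFIX declaration per vocabulary in front of a query body."""
    header = "\n".join(f"PREFIX {label}: <{uri}>" for label, uri in PREFIXES.items())
    return header + "\n" + body


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical example: assumes resources in the store have a dcterms:title.
EXAMPLE_BODY = """SELECT ?resource ?title
WHERE { ?resource dcterms:title ?title }
LIMIT 10"""

query = build_query(EXAMPLE_BODY)
url = build_request_url(ENDPOINT, query)

# Decode the URL again to check that the query survives encoding unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == query)
```

The resulting URL can be pasted into a browser or passed to any HTTP client.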
Pete has also produced browse pages for concepts, people and places, which offer other ways into the data and are great for showing it off. These are in addition to our core deliverables and are not live yet.
In-house cataloguing guidelines
An unexpected result of the SALDA project was a review of our cataloguing procedures. The following guides were produced by me and my colleague Adam Harwood, who is currently cataloguing the University of Sussex Collection.
CALM_ISADG_Collection level: This document maps the required ISAD(G) fields to the CALM fields, with guidelines on how to populate the fields. We have also included the fields required for export to EAD using the Archives Hub report on CALM.
cataloguing procedures component level: This document provides guidelines for completing component level records in CALM.
Next steps
Now the data is on the platform, we will advertise it at open data days. We are working on a leaflet which invites anyone to work with our data and see what they can do.
We are working with our partners at the Keep on the IT infrastructure for the new development. The SALDA project opened a dialogue on Linked Data and has given us useful skills and knowledge about another route we could take to share data between the partners.
At Sussex, we are going to look at our collections and make a priority list of ones where the catalogue data could be turned into Linked Data by considering:
- If we can make the data available under ODC-PDDL
- What changes/additions we need to make to the data and its structure
- What the potential uses/benefits are
As a personal goal, I would like to work with archivists and developers to find common ground on Linked Data: how we understand it, what its uses and benefits are, and what words we use to describe it. I would also like to find examples of it in use, because Linked Data is very behind the scenes and so can be hard to “sell” without an example of its use in human-readable format. I also attended a brilliant “legal update for information professionals” workshop led by Naomi Korn and Professor Charles Oppenheim, which really got me interested in risk management, which relates to the licensing part of the project.
Evidence of reuse
We have registered the dataset on CKAN and hope to be part of the current UK Discovery competition.
Skills
This has been a steep learning curve for me as project manager to get my head around the world of Linked Data. All praise to Pete Johnston who is able to write in a way that I understand, yet still convey the level of technical detail that is required.
Pete has provided the expertise on the project, working with scripts devised for the Locah project and adapting them for SALDA. He has been working with Chris to move the data to the platform and to move the scripts used to our data.lib.sussex.ac.uk URI. You can read more about this in Chris’s blog post.
We are grateful to all the team at the Locah project for forging the path ahead and allowing us to follow in their footsteps.
Chris Keene has created webpages for open data at the University of Sussex Library to keep open data on the agenda. Openness reflects the strategic goals of the Library e-strategy: Search and discovery 2011-2015.
We’ve all learnt more about archival metadata and EAD during the project.
Most significant lessons
Now then, these might be a bit basic and are from my own experience. I’m sure my technical colleagues could add to them, though the lessons we have learnt and the processes we have been through in technical areas are well documented on this blog.
- At the beginning, no one (archive colleagues, library colleagues, friends, family) will know what you are talking about when you mention Linked Data. When you show an example or try and explain it they will look blank. You need to work out a way of explaining and demonstrating it that can be understood.
- Keep in regular contact with technical consultants if they are not part of the in-house team. We had a face-to-face meeting, phone calls and regular (weekly) email contact.
- Think long term about the sustainability and future uses of the data, even if it’s only a six-month project. We thought long and hard about our URI stem to make it as generic and sustainable as possible, and we tried to re-use URIs rather than making lots of new ones.
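To make the point about the URI stem a little more concrete, here is a minimal Python sketch of building URIs from one shared stem. The stem and path segments are read off the two example URLs earlier in this post. The function name, its parameters and the idea of validating the "id"/"doc" segment are my own illustration, not how the SALDA scripts actually work.

```python
from urllib.parse import quote

# Stem inferred from the example URLs in this post.
STEM = "http://data.lib.sussex.ac.uk/archive"

# Both segments appear in the example URLs; treating them as the only
# allowed values is an assumption made for this sketch.
KINDS = ("id", "doc")


def mint_uri(kind, resource_type, local_id, stem=STEM):
    """Join stem, kind, resource type and local identifier into one URI.

    Hypothetical helper: keeping the stem in a single place means every
    URI changes together if the stem ever has to move.
    """
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}, got {kind!r}")
    return "/".join([
        stem.rstrip("/"),
        kind,
        quote(resource_type, safe="/"),  # allow nested types such as person/nra
        quote(local_id, safe=""),        # escape anything unsafe in the identifier
    ])


archive_uri = mint_uri("id", "archivalresource", "gb181SxMOA1")
person_doc_uri = mint_uri("doc", "person/nra", "harrissonthomas1911-1976anthropologist")
print(archive_uri)
print(person_doc_uri)
```

The two printed URIs match the human-readable examples listed near the top of this post.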