About the SALDA Project
Aims, objectives and outputs
SALDA is a JISC-funded project to extract metadata records for the Mass Observation Archive from the University of Sussex Special Collections' Archival Management System (CALM) and convert them into Linked Data that will be made publicly available.
The key steps are:
- The data will be extracted from CALM using an EAD export.
- Eduserv will transform the data into Linked Data, building on templates already in development for the JISC LOCAH Project.
- UKOLN will then work on an enhancement package, using scripts to improve the data and help ensure it integrates into the Linked Data ecosystem.
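The first two steps above can be sketched roughly as follows. This is a minimal illustration only, not the LOCAH templates or the project's actual transformation: the sample record, the `BASE_URI` pattern, the helper names and the choice of Dublin Core terms as predicates are all assumptions made for the example. It reads the `unitid`, `unittitle` and `unitdate` elements of an EAD `did` block and writes one N-Triples line per value.

```python
import xml.etree.ElementTree as ET

# Invented sample record for illustration. A real CALM export will be far
# larger and may declare an XML namespace, which this sketch does not handle.
SAMPLE_EAD = """<ead>
  <archdesc level="fonds">
    <did>
      <unitid>EXAMPLE/1</unitid>
      <unittitle>Example collection title</unittitle>
      <unitdate>1937-1950</unitdate>
    </did>
  </archdesc>
</ead>"""

# Hypothetical URI pattern; the project's real URI scheme is not given here.
BASE_URI = "http://example.org/id/archivalresource/"
DCTERMS = "http://purl.org/dc/terms/"

# Assumed mapping from EAD elements to Dublin Core terms.
FIELD_TO_PREDICATE = {
    "unitid": DCTERMS + "identifier",
    "unittitle": DCTERMS + "title",
    "unitdate": DCTERMS + "date",
}


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def ead_to_ntriples(ead_xml):
    """Return a list of N-Triples lines, one per mapped field in each <did>."""
    root = ET.fromstring(ead_xml)
    triples = []
    for did in root.iter("did"):
        unitid = did.findtext("unitid")
        if not unitid or not unitid.strip():
            continue  # no reference code, so no stable URI can be built
        slug = unitid.strip().lower().replace("/", "-").replace(" ", "")
        subject = BASE_URI + slug
        for tag, predicate in FIELD_TO_PREDICATE.items():
            value = did.findtext(tag)
            if value and value.strip():
                literal = escape_literal(value.strip())
                triples.append('<%s> <%s> "%s" .' % (subject, predicate, literal))
    return triples


if __name__ == "__main__":
    for line in ead_to_ntriples(SAMPLE_EAD):
        print(line)
```

Building the subject URI from the reference code is one possible design choice; it keeps URIs stable across repeated exports as long as the reference codes in CALM do not change.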
The final output from this project will comprise a set of Linked Data relating to the Mass Observation Archive that will be made publicly available via XML. We aim to make it available on the Talis Platform. We will share our experiences of this project, including documentation, tools and software, so that others can achieve the same result.
Wider benefits to the sector and achievements for the host institution
The experience and data gained from the SALDA Project will give us invaluable skills and an understanding of Linked Data and of the methodologies involved in transforming our data and making it available as open data. The knowledge gained will be fed into one of the key work packages of The Keep project, which brings together East Sussex Record Office, Brighton and Hove Council and the University of Sussex Special Collections under one roof in a purpose-built archive repository. There is a clearly identified need to create a unified discovery system within The Keep, along with other shared systems and services, across a diverse range of collections. In addition, the University of Sussex Library Strategic Plan has goals for opening up our data for others to use and for working with national partners. We will look at how the project's skills and knowledge could be rolled out to other collections within the Library Special Collections.
Project Work Plan
- WP1 Set up project team/steering group including creation of a blog
- WP2 IPR issues investigation: research the Open Data Commons Public Domain Dedication and License and compare it with Creative Commons licences
- WP3 Stakeholder needs analysis
- WP4 Extraction of data from CALM database in EAD format. This includes refinements to our data inside CALM, including adding additional fields to make it compatible with the Archives Hub EAD template and expanding abbreviations for names and organisations to provide more ways into the data
- WP5 Transformation of extracted data into Linked Data by Pete Johnston at Eduserv
- WP6 Interim review of project including discussions on enhancement proposals and stakeholder group meeting
- WP7 Enhancement and refining of data to comply with ontologies and standards by Julian Cheal at UKOLN
- WP8 Load data onto the Talis Platform and make it publicly available
- WP9 Documentation of process, release tools and scripts used
- WP10 Evaluation package which will include a report to stakeholders
- WP11 Completion report (including financial statement)
- WP12 Dissemination (including event for The Keep)
The Project Manager is in post from week 3 so the 26 week plan has been adjusted accordingly.
|Risk||Probability||Impact||Action to prevent/manage risk|
|Key project staff not recruited||Low||High||Existing resource can be seconded at University of Sussex if needed. The proposed project team has strong networks in this area.|
|Failure to retain key project staff and contractors during the lifetime of the project||Low||High||Existing resource can be seconded at University of Sussex if needed. The proposed project team has strong networks in this area. The HE (and wider) community contains organisations and consultants with the relevant experience and knowledge.|
|Issues around IPR||Low||Medium||There is a wealth of experience and support in this area, as the MOA already deals extensively with IPR issues due to the nature of the collections. We will also consult with JISC Legal and the RDTF management framework project. An early work package will look into these issues.|
|Difficulties in exporting and converting data||Low||High||CALM supports export of data into EAD with a number of profiles, including one for the Archives Hub. Specific fields will need to exist for the records in question both to export into EAD and for converting into Linked Data. The budget includes funding of Special Collections staff to work on making these changes.|
|August 10– July 11||TOTAL £|
|Total Directly Incurred Staff (A) – Project manager and consultancy
|Non-Staff||August 10– July 11||TOTAL £
|Travel and expenses||£3,000||£3,000|
|Total Directly Incurred Non-Staff (B)
|Directly Incurred Total (C)
|Directly Allocated||August 10– July 11||TOTAL £
|Directly Allocated Total (D)||£13,423||£13,423|
|Indirect Costs (E)||£9,414||£9,414|
|Total Project Cost (C+D+E)||£55,152||£55,152|
|Amount Requested from JISC||£35,000||£35,000|
|Percentage Contributions over the life of the project||JISC
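Some of the figures missing from the rows above can be recovered from those that are stated, because the table defines the total project cost as C+D+E. The short sketch below does that arithmetic. The function names are made up for this example, and treating everything not requested from JISC as a contribution from other sources (presumably the host institution) is an assumption rather than something the table states.

```python
# Figures stated in the budget table, in pounds.
TOTAL_PROJECT_COST = 55152   # C + D + E
DIRECTLY_ALLOCATED = 13423   # D
INDIRECT_COSTS = 9414        # E
REQUESTED_FROM_JISC = 35000


def directly_incurred_total(total, allocated, indirect):
    """Solve total = C + D + E for C, the Directly Incurred Total."""
    return total - allocated - indirect


def percentage_share(part, total):
    """Return part as a percentage of total."""
    return 100 * part / total


if __name__ == "__main__":
    c = directly_incurred_total(TOTAL_PROJECT_COST, DIRECTLY_ALLOCATED, INDIRECT_COSTS)
    jisc = percentage_share(REQUESTED_FROM_JISC, TOTAL_PROJECT_COST)
    # Assumption: the remainder is met from non-JISC sources.
    other = 100 - jisc
    print("Directly Incurred Total (C): £{:,}".format(c))
    print("JISC share: {:.1f}%".format(jisc))
    print("Other contributions: {:.1f}%".format(other))
```

On the stated figures this gives a Directly Incurred Total of £32,315 and a JISC share of roughly 63.5% of the total project cost.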
We plan to release the data under the Open Data Commons Public Domain Dedication and License (PDDL). A dedicated blog post will detail our decision-making process.
Karen Watson, Project Manager. I am seconded from my role as Special Collections Supervisor at the University of Sussex. A qualified archivist since 2009, I have worked with the collections and the Mass Observation Archive for 8 years. I am responsible for managing the SALDA project, making sure objectives are achieved on time and sharing information about the project with stakeholders and the wider community. I am working with Special Collections staff to refine our catalogue data. I am responsible for detailing the progress of the project on the blog. I will hold stakeholder meetings throughout the project to report on progress and successes, and provide a workshop for Keep partners to discuss how SALDA outcomes can be used.
Fiona Courage, Special Collections Manager, curates the Special Collections held at the University of Sussex, including the Mass Observation Archive, and has worked with the collections for 10 years. Following the acquisition of an Archival Management System (CALM) in 2009, Fiona managed the conversion of over 50,000 records from HTML-based lists to EAD-compliant records. She has recently played a key role in the JISC MOCO project, creating an online resource for local groups to contribute multimedia records of everyday life supported by examples from the Mass Observation Archive. She is responsible for the care and research accessibility of the collections and has a particular interest in opening the collection up for learning and teaching for HE and the wider community. She was awarded a Sussex Excellence in Teaching award in 2009 and a Teaching Fellowship award in 2010. She has also recently been successful in attracting funding for a joint ITE project with the Department of Education at Sussex using the MOA collections. Fiona is line manager for Karen and part of the SALDA project team.
Chris Keene, Technical Development Manager, has worked with IT in HE Libraries for over 10 years. He has recently managed the JISC CReDAUL project, implementing an open-source discovery system to search the library catalogues of the universities of Brighton and Sussex. The project included exporting records from the two LMS as MARC and importing them into VuFind. He has managed the implementation and running of numerous applications, including the Institutional Repository, CALM, Aquabrowser and SFX/Metalib. Each of these projects has developed his understanding of working with metadata. Sussex was an early adopter of the Talis Aspire reading-list system, which is powered by the Talis Platform and Linked Data; he has worked with developers on campus to use the system’s API to access the underlying data and help integrate it with other systems. He was involved in the JISC MOSAIC project and is a member of the COPAC steering group. Chris provides vital in-house technical support and guidance for the SALDA project.
Pete Johnston, Technical Researcher at Eduserv. His recent work has been primarily in the areas of metadata/resource description, repositories and technical interoperability, mainly in the context of higher education, with a particular interest in the use of Semantic Web technologies and the emergence of the ‘Linked Data’ approach. He participates in a number of standards development activities, and is an active contributor to the work of the Dublin Core Metadata Initiative. He is currently a member of the DCMI Usage Board and the DCMI Advisory Board, and is a co-author of several DCMI specifications, including the DCMI Abstract Model. He was a member of the Open Archives Initiative Object Reuse and Exchange (OAI ORE) Technical Committee, and a co-editor of the OAI ORE specifications. Along with Andy Powell, he contributes to the eFoundations weblog. Pete will use the experience and knowledge gained on the LOCAH project to transform the data into Linked Data.
Julian Cheal is a software developer at UKOLN. He is currently working on the analysis and visualisation of UK open access repository metadata from the RepUK project. He has experience of writing software to process metadata at UKOLN, and has previous development experience at Aberystwyth University.