Texas Digital Library Conference System, 2014 Texas Conference on Digital Libraries

Introducing Piper, a Repository-Agnostic Batch Deposit Tool
Micah Cooper, James Creel, Doug Hahn, Bruce Herbert, Jeremy Huff, Yu “Lilly” Li, Alexey Maslov, Sarah Potvin

Last modified: 2014-03-13

Abstract


Application developers and librarians from the Texas A&M University Libraries will introduce Piper, a repository-agnostic content deposit tool. In addition to providing background on the impetus behind its creation and its intended user base, we will demonstrate the tool and explain the process of its development.

Impetus.

Prior to Piper’s deployment, batch loads to our DSpace institutional repository were handled primarily by one developer in the Digital Initiatives unit. DSpace affords various workflows for single-item submission, but batches of items must be loaded via the command line on the DSpace server. This server can be an extremely sensitive environment in large organizations whose business cases require backups, firewalls, and high uptime. As part of the workflow for batch loads, which came from diverse sources both inside and outside the Libraries, the developer had engineered procedures for metadata quality control prior to deposit. The developer frequently encountered batch loads with missing files or with incomplete or ill-formed metadata.

Design and goals.  

The initial goal of Piper is to allow greater flexibility in our metadata workflow and to enable a small group of non-technical staff to perform batch loads. The tool enables staff with the appropriate privileges to assemble, check, and deposit batch loads through a graphical user interface. A central feature of Piper is its ability to validate metadata and files prior to deposit. The tool relies on a suite of automated and customizable verifiers to confirm that metadata are properly encoded and that files are correctly specified.
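The idea of a suite of pluggable verifiers can be illustrated with a minimal sketch. It is not Piper's actual code: the names `Item`, `require_fields`, `files_exist`, and `validate_batch`, as well as the metadata field names, are hypothetical and chosen only for illustration. Each verifier takes one item and returns a list of problems, so a batch can be checked in full before anything is deposited.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class Item:
    """One record in a batch: metadata plus the content files it names (hypothetical model)."""
    metadata: dict[str, str]
    files: list[Path] = field(default_factory=list)


# A verifier inspects one item and returns human-readable problems (empty list = pass).
Verifier = Callable[[Item], list[str]]


def require_fields(*names: str) -> Verifier:
    """Build a verifier that flags metadata fields that are absent or blank."""
    def verify(item: Item) -> list[str]:
        return [
            f"missing metadata field: {name}"
            for name in names
            if not item.metadata.get(name, "").strip()
        ]
    return verify


def files_exist(item: Item) -> list[str]:
    """Flag items that list no files or list files that are not on disk."""
    if not item.files:
        return ["no content files listed"]
    return [f"file not found: {path}" for path in item.files if not path.is_file()]


def validate_batch(items: list[Item], verifiers: list[Verifier]) -> dict[int, list[str]]:
    """Run every verifier on every item; return problems keyed by item position."""
    report: dict[int, list[str]] = {}
    for index, item in enumerate(items):
        problems = [message for verify in verifiers for message in verify(item)]
        if problems:
            report[index] = problems
    return report
```

Under these assumptions, a deposit step would proceed only when `validate_batch(items, [require_fields("title", "creator"), files_exist])` returns an empty report. New checks can be added by writing another function with the same signature.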

In its first phase, Piper is designed to mimic the work of the developer who previously performed batch loads, with procedures for validating metadata and files and the flexibility to upload multiple content files and specialized licenses as part of item records. Once Piper has been honed for use by this specialized group, we plan to expand its functionality and to facilitate and promote its use by the larger Texas A&M community, as part of ongoing efforts to populate our repository with open access publications.

We have developed Piper in an iterative process in which the customer chooses which features and fixes will be handled in each cycle (typically two weeks) and accepts or rejects the implementations after live testing and demonstration at the end of the cycle. These practices are informed by the Agile school of project management popular in software development and other technical industries. In this way we seek to minimize development effort wasted on unneeded features and to enable continuous delivery of value to stakeholders.


Keywords


metadata; batch process; repositories; DSpace

Full Text: Slideshow