
SAF Creator

SAFCreator is a desktop application written in Java. Its purpose is to prepare Simple Archive Format (SAF) archives for import into DSpace repositories. There are several good tools for this purpose, and every use case is different. Many digital curators package their SAF archives with local custom scripts, but general-purpose tools can be immensely useful, especially when supplemented by such scripts. Other popular general-purpose SAF tools that may meet your needs include PySAF and SAFBuilder.

Deployment basics

Running SAFCreator requires a JVM.

If you prefer not to build from source, a compiled jar is provided at this link: https://github.com/jcreel/SAFCreator/raw/master/jarfile/SAFCreator-0.0.2-SNAPSHOT.one-jar.jar

Building SAFCreator requires Apache Maven. Build with:

    mvn clean package

Then run, replacing the version as appropriate:

    java -jar target/SAFCreator-0.0.2-SNAPSHOT.one-jar.jar

Usage instructions

You need a spreadsheet, saved as a CSV file, that contains the metadata and references to the files. Each row represents one item, and each column a metadata field. This is a typical starting format for digital library metadata.

To reference the files, the spreadsheet needs at least one special column headed either “filename” or “bundle:ORIGINAL”. Using “filename” places the files in the default ORIGINAL bundle; to target a different bundle, replace ORIGINAL in the “bundle:” heading with that bundle's name. You can have multiple columns for multiple bundles. In each cell under such a heading, list the filenames of the bitstreams for that item, separated by a double bar (||). You can also use subpaths relative to the top-level directory of the files, and * as a wildcard to include everything under a subpath.
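The heading and cell conventions above can be sketched in Python. This is only an illustrative reading of the rules described here, not SAFCreator's own code: the function names bundle_for_heading and split_cell are hypothetical, and exact, case-sensitive heading matching is an assumption.

```python
DELIMITER = "||"  # separator between multiple values in one cell


def bundle_for_heading(heading):
    """Return the bundle named by a file column heading, or None.

    Assumes headings are matched exactly (case-sensitive); SAFCreator's
    actual matching rules may differ.
    """
    heading = heading.strip()
    if heading == "filename":
        return "ORIGINAL"  # plain "filename" defaults to the ORIGINAL bundle
    if heading.startswith("bundle:"):
        return heading[len("bundle:"):]
    return None  # not a file column, e.g. a metadata field such as dc.title


def split_cell(cell):
    """Split a cell on the double bar, trimming whitespace and empty parts."""
    return [part.strip() for part in cell.split(DELIMITER) if part.strip()]
```

For example, split_cell("a.pdf||docs/*") gives the two references a.pdf and docs/*, and bundle_for_heading("bundle:THUMBNAIL") gives THUMBNAIL.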

The other headings are dc-style field labels, e.g. “dc.title” or “dc.description.abstract”. The cells in each such column hold the values of that field for each item, again delimited by a double bar (||).
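As a rough illustration of the overall layout, the following Python sketch builds a two-item CSV with one “filename” column, one extra bundle column, and two metadata columns. All file names, titles, and the choice of the THUMBNAIL bundle are made up for the example, and build_csv is a hypothetical helper, not part of SAFCreator.

```python
import csv
import io

# Invented sample items: one row per item, one column per heading.
rows = [
    {
        "filename": "thesis.pdf||appendix.pdf",
        "bundle:THUMBNAIL": "thesis.jpg",
        "dc.title": "An Example Thesis",
        "dc.description.abstract": "A short abstract.",
    },
    {
        "filename": "scans/*",
        "bundle:THUMBNAIL": "notes.jpg",
        "dc.title": "Field Notes||Notes de terrain",
        "dc.description.abstract": "Scanned notebooks.",
    },
]


def build_csv(rows):
    """Serialize the rows to CSV text, using the first row's keys as headings."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(build_csv(rows))
```

To save the result for use with the file picker, the same writer can target a file opened with open(path, "w", newline="", encoding="utf-8") instead of the in-memory buffer.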

You can find a couple of example projects at https://github.com/jcreel/SAFCreator/tree/master/src/main/resources/SAF.

In SAFCreator, use the file picker to select the CSV file, then select the directory containing the files referred to in the “filename” or bundle columns, and the directory where you want the SAF written. Load the batch with the button, go to the validation tab and validate the batch, then return to the first tab to write the SAF.
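Before loading a batch, it can help to check that every file reference in the CSV resolves inside the chosen files directory. The sketch below is a stand-alone pre-check under stated assumptions, not a substitute for SAFCreator's validation tab: missing_files is a hypothetical helper, the CSV is assumed to be UTF-8, and a wildcard is assumed to take the form "subpath/*".

```python
import csv
from pathlib import Path


def missing_files(csv_path, files_dir):
    """Return (row_number, reference) pairs that do not resolve on disk."""
    files_dir = Path(files_dir)
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        for row_number, row in enumerate(reader, start=2):  # row 1 is the header
            for heading, cell in row.items():
                if not heading or not (
                    heading == "filename" or heading.startswith("bundle:")
                ):
                    continue  # metadata column, not a file column
                for ref in (cell or "").split("||"):
                    ref = ref.strip()
                    if not ref:
                        continue
                    if ref.endswith("*"):
                        # Assumed wildcard form "subpath/*": check the directory.
                        found = (files_dir / ref[:-1]).is_dir()
                    else:
                        found = (files_dir / ref).is_file()
                    if not found:
                        problems.append((row_number, ref))
    return problems
```

An empty result means every reference was found; otherwise each pair gives the spreadsheet row number and the reference that could not be located.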

Character encoding issues on Windows

DSpace works best with everything encoded in UTF-8, but the JVM on a Windows machine defaults to the local encoding (on Java 17 and earlier; since Java 18 the default is UTF-8). This can be addressed by running Java with the -Dfile.encoding=UTF-8 flag, e.g. java -Dfile.encoding=UTF-8 -jar SAFCreator-0.0.2-SNAPSHOT.one-jar.jar

Thanks to Eric Pennington for providing this solution.

For those in a hurry

Again, if you want a direct download to avoid building from source, you can grab it here: https://github.com/jcreel/SAFCreator/raw/master/jarfile/SAFCreator-0.0.2-SNAPSHOT.one-jar.jar
