bagit-java's Introduction

BagIt Library (BIL)

Description

The BAGIT LIBRARY is a software library intended to support the creation, manipulation, and validation of bags. Its current version is 0.97. It is version aware with the earliest supported version being 0.93.

Requirements

  • Java 8
  • gradle (for development only)

Support

  1. The Digital Curation Google Group (https://groups.google.com/d/forum/digital-curation) is an open discussion list that reaches many of the contributors to and users of this open-source project
  2. If you have found a bug please create a new issue on the issues page
  3. If you would like to contribute, please submit a pull request

Major differences between version 5 and 4.*

Command Line Interface

The 5.x versions do not include a command-line interface. Users who need a command-line utility can continue to use the latest 4.x release (download 4.12.3) or switch to an alternative implementation such as bagit-python or BagIt for Ruby.

Serialization

Starting with the 5.x versions, bagit-java no longer supports directly serializing a bag to an archive file. The examples show how to implement a custom serializer for the zip and tar formats.
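
As a rough idea of what a custom zip serializer involves, here is a minimal sketch that uses only java.util.zip from the JDK. BagZipper, zip, and entryName are hypothetical names chosen for illustration; they are not part of bagit-java, and the bundled examples may be structured differently.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of bagit-java.
class BagZipper {
  // Writes every regular file under bagRoot into zipFile. Entry names are
  // prefixed with the bag directory name so the archive unpacks to one folder.
  static void zip(Path bagRoot, Path zipFile) throws IOException {
    List<Path> files;
    try (Stream<Path> walk = Files.walk(bagRoot)) {
      files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }
    try (OutputStream out = Files.newOutputStream(zipFile);
         ZipOutputStream zos = new ZipOutputStream(out)) {
      for (Path file : files) {
        zos.putNextEntry(new ZipEntry(entryName(bagRoot, file)));
        Files.copy(file, zos);
        zos.closeEntry();
      }
    }
  }

  // Zip entry names use forward slashes on every platform, so the name is
  // built from the individual path elements rather than Path.toString().
  static String entryName(Path bagRoot, Path file) {
    StringBuilder name = new StringBuilder(bagRoot.getFileName().toString());
    for (Path part : bagRoot.relativize(file)) {
      name.append('/').append(part);
    }
    return name.toString();
  }
}
```

The sketch assumes zipFile is written outside bagRoot and that bagRoot is not a filesystem root.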

Fetching

The 5.x versions do not include a core fetch.txt implementation. If you need this functionality, the FetchHttpFileExample example demonstrates how you can implement this feature with your additional application and workflow requirements.
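
For orientation, a fetch.txt line has the form "URL LENGTH FILENAME", where LENGTH may be "-" when unknown. The following is a minimal sketch of parsing such a line and downloading the item with plain JDK classes. FetchLine is a hypothetical class, not the library's API and not necessarily how FetchHttpFileExample does it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of bagit-java.
class FetchLine {
  final String url;
  final String length; // octet count, or "-" when unknown
  final String path;   // file path relative to the bag root

  FetchLine(String url, String length, String path) {
    this.url = url;
    this.length = length;
    this.path = path;
  }

  // Parses one fetch.txt line; the limit of 3 keeps spaces inside the path.
  static FetchLine parse(String line) {
    String[] parts = line.trim().split("\\s+", 3);
    if (parts.length != 3) {
      throw new IllegalArgumentException("Malformed fetch.txt line: " + line);
    }
    return new FetchLine(parts[0], parts[1], parts[2]);
  }

  // Downloads the item to its place inside the bag, refusing paths that
  // would escape the bag directory (for example "../outside.txt").
  void fetchInto(Path bagRoot) throws IOException {
    Path target = bagRoot.resolve(path).normalize();
    if (!target.startsWith(bagRoot.normalize())) {
      throw new IOException("Refusing to write outside the bag: " + path);
    }
    Files.createDirectories(target.getParent());
    try (InputStream in = new URL(url).openStream()) {
      Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```

Retries, timeouts, and checking the downloaded size against LENGTH are left out; those are the workflow-specific parts an application would add.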

Internationalization

All logging and error messages have been put into a ResourceBundle. This allows for all the messages to be translated to multiple languages and automatically used during runtime. If you would like to contribute to translations please visit https://www.transifex.com/acdha/bagit-java/dashboard/ or https://crowdin.com/project/bagit-java.
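
To illustrate the general mechanism (not the library's actual bundle), here is a minimal sketch of looking up and formatting a message from a ResourceBundle. The key missing_file and its text are made up for this example, and the bundle is built from an in-memory string only to keep the sketch self-contained; a real project would keep one .properties file per locale and load it with ResourceBundle.getBundle(baseName, locale).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper, not part of bagit-java.
class Messages {
  // Builds a bundle from .properties-style text.
  static ResourceBundle fromProperties(String properties) {
    try {
      return new PropertyResourceBundle(new StringReader(properties));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Looks up the pattern for key and fills in the {0}, {1}, ... placeholders.
  static String format(ResourceBundle bundle, String key, Object... args) {
    return MessageFormat.format(bundle.getString(key), args);
  }
}
```

Translators then only touch the message text, while the calling code keeps using the same key.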

New Interfaces

The 5.x version is a complete rewrite of the bagit-java library which attempts to follow modern Java practices and will require some changes to existing code:

Examples of using the new bagit-java library

Create a bag from a folder using version 0.97
Path folder = Paths.get("FolderYouWantToBag");
StandardSupportedAlgorithms algorithm = StandardSupportedAlgorithms.MD5;
boolean includeHiddenFiles = false;
Bag bag = BagCreator.bagInPlace(folder, Arrays.asList(algorithm), includeHiddenFiles);
Read an existing bag (version 0.93 and higher)
Path rootDir = Paths.get("RootDirectoryOfExistingBag");
BagReader reader = new BagReader();
Bag bag = reader.read(rootDir);
Write a Bag object to disk
Path outputDir = Paths.get("WhereYouWantToWriteTheBagTo");
BagWriter.write(bag, outputDir); //where bag is a Bag object
Verify Complete
boolean ignoreHiddenFiles = true;
BagVerifier verifier = new BagVerifier();
verifier.isComplete(bag, ignoreHiddenFiles);
Verify Valid
boolean ignoreHiddenFiles = true;
BagVerifier verifier = new BagVerifier();
verifier.isValid(bag, ignoreHiddenFiles);
Quickly verify by payload-oxum
boolean ignoreHiddenFiles = true;

if(BagVerifier.canQuickVerify(bag)){
  BagVerifier.quicklyVerify(bag, ignoreHiddenFiles);
}
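
For context, the Payload-Oxum value in bag-info.txt has the form "OctetCount.StreamCount": the total number of payload bytes and the number of payload files. A quick check compares those two numbers without hashing anything. The sketch below shows the idea with JDK classes only; PayloadOxum is a hypothetical class, not the BagVerifier implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of bagit-java.
class PayloadOxum {
  final long octetCount;
  final long streamCount;

  PayloadOxum(long octetCount, long streamCount) {
    this.octetCount = octetCount;
    this.streamCount = streamCount;
  }

  // Parses a declared value such as "279164409832.1198".
  static PayloadOxum parse(String value) {
    String[] parts = value.trim().split("\\.");
    if (parts.length != 2) {
      throw new IllegalArgumentException("Expected OctetCount.StreamCount: " + value);
    }
    return new PayloadOxum(Long.parseLong(parts[0]), Long.parseLong(parts[1]));
  }

  // Counts bytes and regular files under the bag's data directory.
  static PayloadOxum of(Path dataDir) throws IOException {
    long octets = 0;
    long streams = 0;
    try (Stream<Path> walk = Files.walk(dataDir)) {
      Iterable<Path> files = walk.filter(Files::isRegularFile)::iterator;
      for (Path file : files) {
        octets += Files.size(file);
        streams++;
      }
    }
    return new PayloadOxum(octets, streams);
  }

  boolean sameAs(PayloadOxum other) {
    return octetCount == other.octetCount && streamCount == other.streamCount;
  }

  @Override
  public String toString() {
    return octetCount + "." + streamCount;
  }
}
```

A match only shows that sizes and counts agree; it does not detect changed content of equal length, which is why a full validity check still compares checksums.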
Add other checksum algorithms

You only need to implement 2 interfaces:

public class MyNewSupportedAlgorithm implements SupportedAlgorithm {
  @Override
  public String getMessageDigestName() {
    return "SHA3-256";
  }
  @Override
  public String getBagitName() {
    return "sha3256";
  }
}

public class MyNewNameMapping implements BagitAlgorithmNameToSupportedAlgorithmMapping {
  @Override
  public SupportedAlgorithm getMessageDigestName(String bagitAlgorithmName) {
    if("sha3256".equals(bagitAlgorithmName)){
      return new MyNewSupportedAlgorithm();
    }

    return StandardSupportedAlgorithms.valueOf(bagitAlgorithmName.toUpperCase());
  }
}

and then add the implemented BagitAlgorithmNameToSupportedAlgorithmMapping class to your BagReader or BagVerifier object before using their methods.
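
One thing worth checking before wiring in a new algorithm: assuming the value returned by getMessageDigestName() is handed to java.security.MessageDigest (the method name suggests so, but this is an assumption), the running JVM must actually provide that digest. "SHA3-256", for instance, is only built into the JDK from Java 9 on, so a Java 8 runtime would need an additional security provider. The sketch below uses a hypothetical DigestNames helper to test availability and compute a hex digest.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of bagit-java.
class DigestNames {
  // True when some installed security provider supports the digest name.
  static boolean isAvailable(String messageDigestName) {
    try {
      MessageDigest.getInstance(messageDigestName);
      return true;
    } catch (NoSuchAlgorithmException e) {
      return false;
    }
  }

  // Lowercase hex digest of content, as it would appear in a manifest line.
  static String hex(String messageDigestName, byte[] content)
      throws NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance(messageDigestName);
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest(content)) {
      hex.append(String.format("%02x", b & 0xff));
    }
    return hex.toString();
  }
}
```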

Check for potential problems

The BagIt format is extremely flexible and allows for some conditions which are technically allowed but should be avoided to minimize confusion and maximize portability. The BagLinter class allows you to easily check a bag for warnings:

Path rootDir = Paths.get("RootDirectoryOfExistingBag");
BagLinter linter = new BagLinter();
List<BagitWarning> warnings = linter.lintBag(rootDir, Collections.emptyList());

You can provide a list of specific warnings to ignore:

Path rootDir = Paths.get("RootDirectoryOfExistingBag");
BagLinter linter = new BagLinter();
List<BagitWarning> warnings = linter.lintBag(rootDir, Arrays.asList(BagitWarning.OLD_BAGIT_VERSION));

Developing Bagit-Java

Bagit-Java uses Gradle for its build system. Check out the great documentation to learn more.

Running tests and code quality checks

Inside the bagit-java root directory, run ./gradlew check.

Uploading to maven central
  1. Follow their guides:
     • http://central.sonatype.org/pages/releasing-the-deployment.html
     • https://issues.sonatype.org/secure/Dashboard.jspa
  2. Once you have access, create an official release and upload it, specifying the version, by running ./gradlew -Pversion=<VERSION> uploadArchives
  3. Don't forget to tag the repository!
Uploading to jcenter
  1. Follow their guide:
     • https://github.com/bintray/bintray-examples/tree/master/gradle-bintray-plugin-examples
  2. Once you have access, create an official release and upload it, specifying the version, by running ./gradlew -Pversion=<VERSION> bintrayUpload
  3. Don't forget to tag the repository!

Note if using with Eclipse

Simply run ./gradlew eclipse and it will automatically create an Eclipse project that you can import.

Roadmap for this library

  • Fix bugs/issues reported with new library (on going)
  • Translate to various languages (on going)

bagit-java's People

Contributors

acdha, ansell, banzaiman, calvinwo, danshim, fangfu, johnscancella, jscancella, kuno-loc, mikedurbin, mjordan, thomasjejkal

bagit-java's Issues

updatebagsize command

Request from Andrew Cook:

On the BIL Commandline Help page there is no command in the list for "updatebagsize" (of the bag-info.txt). Would you consider adding this command?

I was doing some experimenting with the updatepayloadoxum command and noticed that when using this command it doesn't include changing (updating) the Bag-Size field in the bag-info.txt.

I understand the update command will update the Bag-Size field, but it also changes the Bagging-Date field which I would want to keep as the original bagging date. What if someone wants to keep this date as when it was originally bagged, wouldn't a separate command for updating the bag size be needed? Or does this not make sense?

Unit tests failing on Windows

Running 5.0.0 on Windows 7, Java 1.8

About 30 unit tests are failing, due to Windows FileSystem differences from UNIX - these include

  • Resolving file path
  • Path separators
  • File name restrictions (e.g. cannot create a Path object with '\r' or '\n')
  • Differences in behavior when employing delete() method on a temporary file
  • Differences in causing a file or directory to be hidden (must explicitly set property; starting name with "." not enough)
  • End of line characters in comparisons of expected and actual content

Attached file lists individual tests
locJunitErrorsAbbrev.docx

File streams are not closed before throwing exceptions

After some issues with file handles being left open after calling into BagIt (similar to issue #7), I found that there are some places where file streams (FileSystems/DirNodes) are not closed before throwing exceptions.

I'm not sure I found all such places in the code, but one place is here in FileSystemFactory.getDirNodeForBag():

    DirNode root = fs.getRoot();
    if (format.isSerialized) {
        if (root.listChildren().size() != 1) {
            throw new RuntimeException("Unable to find bag_dir in serialized bag");
        }
        FileSystemNode bagDirNode = root.listChildren().iterator().next();
        if (! (bagDirNode instanceof DirNode)) {
            throw new RuntimeException("Unable to find bag_dir in serialized bag");
        }
        root = (DirNode)bagDirNode;
    }
    return root;

root should be closed before throwing either RuntimeException.
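
The pattern being asked for can be sketched generically: validate the resource, and if validation fails, close it before the exception propagates. CloseOnFailure and requireValid are hypothetical names, and the sketch uses java.io.Closeable as a stand-in rather than the library's DirNode type.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Predicate;

// Hypothetical helper, not part of bagit-java.
class CloseOnFailure {
  // Returns the resource when it passes the check. Otherwise closes it and
  // throws; a failure while closing is attached as a suppressed exception.
  static <T extends Closeable> T requireValid(T resource, Predicate<T> isValid, String message) {
    if (isValid.test(resource)) {
      return resource;
    }
    RuntimeException failure = new RuntimeException(message);
    try {
      resource.close();
    } catch (IOException e) {
      failure.addSuppressed(e);
    }
    throw failure;
  }
}
```

On success the caller still owns the resource and remains responsible for closing it, for example with try-with-resources.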

TarWriter

Are there any plans to again implement a TarWriter?

JavaFX; Don't use a GUI library in a command-line application

When submitting an issue please include:

  • 5.0.0-RC10
  • Ubuntu 16.04

Please format it in the given when then style

For example (from link above):

Given

  • I have a command line application

When

  • I use a GUI library, and it requires Oracle Java

Then

  • Downstream users are sad 😢

Sorry. The GivenWhenThen formatting seems really odd to me.


Would it be possible to use this instead of JavaFX?

BagIt does not close bag properly when inspecting it on Windows 7

On Windows 7, when I use the driver command "verifycomplete" on a bag and I try to delete the bag after, it throws the exception

java.nio.file.FileSystemException: testbaginvalid.zip -> error\testbaginvalid.zip: The process cannot access the file because it is being used by another process.

The exception can be reproduced with

@Test
public void test() throws Exception {
    Path testBag = Paths.get("testbaginvalid.zip");
    CommandLineBagDriver driver = new CommandLineBagDriver();
    driver.execute(new String[] { "verifycomplete", testBag.toAbsolutePath().toString(), "--noresultfile" });

    // Throws IOException
    // "The process cannot access the file because it is being used by another process."
    // System.gc();
    // Thread.sleep(2000);
    Files.delete(testBag);
}

When I uncomment the following lines above Files.delete(testBag);

        System.gc();
        Thread.sleep(2000);

The test is able to delete the bag. The message below is also printed to the console sometimes.

Cleaning up unclosed ZipFile for archive C:\Users\user\git\bagit\testbaginvalid.zip

After the test is finished, I am able to delete the file.

The invalid bag had the Bagit.txt file inside filled with random characters. The manifest hashes were replaced with random characters.

It also does not happen only for invalid bags. I attempted it with a valid bag and I was still not able to delete the bag after it is done validating.

I would like to not use System.gc(). Is this a known problem for Windows? I've read here https://issues.jenkins-ci.org/browse/JENKINS-15331 that Windows locks files whenever it is opened by something.

Edit:
I looked in the FileSystemFactory.getDirNodeForBag and it looks like the FileSystem is not getting closed unless it is supposed to get closed further on. When I close this stream, I am able to delete the bag but the validation does not work.

"verifypayloadmanifests" operation returns True if no manifest exists

When run on a bag with no payload manifests, the "verifypayloadmanifests" operation returns true. This skipped test is indistinguishable from the result of "manifest checked, all files valid".

It would make sense for this to return an error - if "verify payload manifests" is explicitly requested, it seems strange to report success if there is no manifest to verify.

Error in SimpleResult.add*Message()

In most of the SimpleResult.add*Message(), the constructor for SimpleMessage is invoked with arguments in the wrong order.

public void addWarningMessage(String code, String message) {
    SimpleMessage simpleMessage = new SimpleMessage(message, code);
    simpleMessage.setMessageType(SimpleMessage.MESSAGE_TYPE_WARNING);
    this.addSimpleMessage(simpleMessage);
}

public SimpleMessage(String code, String message, String subject) {
    this.code = code;
    this.message = message;
    this.subject = subject;
}

No data folder generated when the payload is empty

Copy from the end of #35:

In case of an empty payload in version 4.11, I notice that the data folder is not generated at all, presumably due to the bag being empty/having no payload. Is that the expected behaviour? At first glance I would expect an empty data folder to be there, but version 4.11 doesn't do that. In the BagIt documentation however, I read that this data folder MUST exist.

Notice that also NO manifest-md5.txt file is created for a bag with no payload. Again, in the documentation I read "Every bag MUST contain one payload manifest file, ...".

Any Java examples?

Dear all,

I'm searching some documentation about how to use the Java Bagit library. Are there any example codes, tutorials etc.? (It seems so that the Java Doc is not complete, class "bag" contains nearly no doc?).
Thank you!
Best,

Tom

Incorrectly setting Logger name in 2 classes

When submitting an issue please include:

  • In Bagit 5.0
  • On Windows 7, Java 8

Class passed to LoggerFactory in 2 classes is incorrect

  1. Class BagCreator: private static final Logger logger = LoggerFactory.getLogger(BagVerifier.class);
  2. Class BagReader: private static final Logger logger = LoggerFactory.getLogger(PayloadFileExistsInManifestVistor.class);

Handling empty directories

Dear all,

We found a problem when a payload contains an empty directory. The BagIt specification says something about handling empty directories (adding a ".keep" file), maybe the java lib should also act like this? The traceback we get:

Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
    at gov.loc.repository.bagit.writer.impl.FileSystemWriter.removeExtraFiles(FileSystemWriter.java:232)
    at gov.loc.repository.bagit.writer.impl.FileSystemWriter.write(FileSystemWriter.java:222)
    at de.mpg.rzg.tacoharvest.bagit.CreateBagitContainer.createBagIt(CreateBagitContainer.java:37)

Maybe a change in line 239 in gov.loc.repository.bagit.writer.impl.FileSystemWriter should be enough?:

if (recurse && file.listFiles().length > 0) {

NullPointerException in FileSystemWriter

I get a NullPointerException from FileSystemWriter, line 232, whose call originates from line 222 and PreBagImpl, line 134, when I want to create a bag with empty/no payload.

val inputDir = multiDepositDir(settings, datasetID)
val inputDirExists = inputDir.exists
val outputBagDir = // the output directory for the bag

val bagFactory = new BagFactory
val preBag = bagFactory.createPreBag(outputBagDir)
val bag = bagFactory.createBag(outputBagDir)

// no bag.addFilesToPayload(...), since I want an empty payload
bag.makeComplete

val fsw = new FileSystemWriter(bagFactory)
fsw.setTagFilesOnly(true)
fsw.write(bag, outputBagDir)

preBag.makeBagInPlace(Version.V0_97, false)

The problem originates in the last line while calling preBag.makeBagInPlace(Version.V0_97, false) and finally reaches the point of failure in FileSystemWriter at line 232 in dir.listFiles(), which evaluates to null. According to the documentation on this expression, this should only happen if dir does not exist or if an IOException occurs. I assume the former is the case here, which would easily be dealt with by checking for its existence before calling removeExtraFiles with this directory (call is in line 222 of FileSystemWriter).

Is this observation correct and can you fix this?
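
For reference, File.listFiles() returns null when the path is not a directory or an I/O error occurs, so any caller that iterates over the result needs a guard. A minimal sketch of such a guard follows; SafeListing is a hypothetical helper, not the fix that was applied to FileSystemWriter.

```java
import java.io.File;

// Hypothetical helper, not part of bagit-java.
class SafeListing {
  // Returns the directory's children, or an empty array when dir does not
  // exist, is not a directory, or cannot be read.
  static File[] listFilesOrEmpty(File dir) {
    File[] children = dir.listFiles();
    return children == null ? new File[0] : children;
  }
}
```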

some tests fail under windows

from [email protected]
We are looking at adapting your terrific Java Bagit library for use at Portico.

One thing I've noticed, and I am wondering whether something is woolly in my configuration or whether others have this problem.

I'm running the unit tests in Eclipse (Mars), Java 1.8, on Windows 7, and a bunch of the unit tests are failing, mostly tripping up on file path strings (more particularly, the separator), but also simply throwing "file not found" exceptions. That is reasonable enough for files/directories that aren't actually in the resources folders on GitHub, but I would think the JVM would throw those exceptions on other OSes as well.

Some details below, but what I was wondering is if you heard this from anyone else?

Thanks, and thanks for the terrific work on the Bagit code.

Sheila

******************************* some details

AddPayloadToBagManifestVistorTest.includeDotKeepFilesInManifest()

Path start = Paths.get(getClass().getClassLoader().getResource("dotKeepExampleBag").getFile()).resolve("data");

java.nio.file.InvalidPathException: Illegal char <:> at index 2: /C:/Users/smorrissey/workspace-nextgen/bagit/target/test-classes/dotKeepExampleBag

Path start = Paths.get(getClass().getClassLoader().getResource("dotKeepExampleBag").getFile()).resolve("data");

URL: file:/C:/Users/smorrissey/workspace-nextgen/bagit/target/test-classes/dotKeepExampleBag

URL.getFile(): /C:/Users/smorrissey/workspace-nextgen/bagit/target/test-classes/dotKeepExampleBag

so you are doing Paths.get() on /C:/Users/smorrissey/workspace-nextgen/bagit/target/test-classes/dotKeepExampleBag

What you want is Paths.get() on C:/Users/smorrissey/workspace-nextgen/bagit/target/test-classes/dotKeepExampleBag

Path start;
String osName = System.getProperty("os.name");
if (osName.contains("Windows")) {
    URL url = getClass().getClassLoader().getResource("dotKeepExampleBag");
    String urlFile = url.getFile();
    Path basePath = Paths.get(urlFile.substring(1));
    start = basePath.resolve("data");
}
else {
    start = Paths.get(getClass().getClassLoader().getResource("dotKeepExampleBag").getFile()).resolve("data");
}
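
An alternative that avoids the operating-system check, sketched under the assumption that the resource is a plain file: URL (not inside a jar): convert the URL to a URI and pass that to Paths.get. This handles the Windows drive letter and also decodes escapes such as %20, which URL.getFile() leaves in place. ResourcePaths is a hypothetical helper name.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical test helper, not part of bagit-java.
class ResourcePaths {
  // Converts a file: URL (for example from ClassLoader.getResource) to a Path.
  static Path toPath(URL resource) {
    try {
      return Paths.get(resource.toURI());
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("Not a valid URI: " + resource, e);
    }
  }
}
```

In a test this would read something like ResourcePaths.toPath(getClass().getClassLoader().getResource("dotKeepExampleBag")).resolve("data").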

AddPayloadToBagManifestVistorTest.testSkipHiddenFile()

failing at if(!includeHiddenFiles && Files.isHidden(path) && !path.endsWith(".keep")){ --

Files.isHidden(path) is throwing an exception because the so-called hidden directory does not in fact exist in resources

java.nio.file.NoSuchFileException: \foo.someHiddenDir

    at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
    at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
    at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
    at sun.nio.fs.WindowsFileSystemProvider.isHidden(Unknown Source)
    at java.nio.file.Files.isHidden(Unknown Source)
    at gov.loc.repository.bagit.creator.AddPayloadToBagManifestVistor.visitFile(AddPayloadToBagManifestVistor.java:53)
    at gov.loc.repository.bagit.creator.AddPayloadToBagManifestVistorTest.testSkipHiddenFile(AddPayloadToBagManifestVistorTest.java:57)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

Same problem with AddPayloadToBagManifestVistorTest.testSkipHiddenDirectory()

Sheila M. Morrissey

Senior Researcher

ITHAKA

100 Campus Drive

Suite 100

Princeton NJ 08540

609-986-2221

ITHAKA (www.ithaka.org) is a not-for-profit organization that helps the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways. We provide innovative services that benefit higher education, including Ithaka S+R, JSTOR, and Portico.

filenames not percent encoded

Both the 5.* and 4.* versions of bagit-java do not encode filenames in manifests if the filename contains a newline, a carriage return, or both. They also do not decode filenames that have been properly percent-encoded.
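
A minimal sketch of the encoding in question, following the rule in BagIt 1.0 (RFC 8493) that line feed, carriage return, and the percent sign in a manifest file path are percent-encoded. ManifestPathEncoding is a hypothetical helper, not the library's implementation.

```java
// Hypothetical helper, not part of bagit-java.
class ManifestPathEncoding {
  // "%" is escaped first so the escapes added afterwards are not re-encoded.
  static String encode(String path) {
    return path.replace("%", "%25")
               .replace("\n", "%0A")
               .replace("\r", "%0D");
  }

  // "%25" is decoded last so a literal "%0A" in a file name, stored as
  // "%250A", is not mistaken for an encoded line feed.
  static String decode(String encoded) {
    return encoded.replace("%0A", "\n").replace("%0a", "\n")
                  .replace("%0D", "\r").replace("%0d", "\r")
                  .replace("%25", "%");
  }
}
```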

Add SHA-3 algorithm support

My organization is using the SHA-3 algorithm (http://en.wikipedia.org/wiki/SHA-3). We would like to request this algorithm to be added to the list of available algorithms that can be used by Bagit.

We would also like to change the algorithm used when updating the data manifest. Currently that feature is only available for updating the tag manifest and when first creating the bag.

ZipWriter compression level cannot be customised

The ZipWriter compression level looks like it could be theoretically customised, but any settings for it, from 2-9, fail with the following error:

java.lang.IllegalArgumentException: Invalid compression level: -5
at org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream.setLevel(ZipArchiveOutputStream.java:712)
at gov.loc.repository.bagit.writer.impl.ZipWriter.startBag(ZipWriter.java:70)
at gov.loc.repository.bagit.impl.AbstractBag.accept(AbstractBag.java:430)
at gov.loc.repository.bagit.writer.impl.ZipWriter.write(ZipWriter.java:158)

The error is due to the following call that uses whatever level the user sent in, and modifies it to be a negative number, which is always going to be invalid:

this.zipOut.setLevel(this.compressionLevel * -1);

Trying to set compressionLevel to be a negative number to fix this fails with the following error:

java.lang.RuntimeException: Valid compression levels are 0-9.
at gov.loc.repository.bagit.writer.impl.ZipWriter.setCompressionLevel(ZipWriter.java:55)

The reason that the default, 1, seems to work is that -1 is apparently a special value, although I am not sure what it represents.
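
For what it is worth, -1 is the value of java.util.zip.Deflater.DEFAULT_COMPRESSION, which would explain why only the default survives the sign flip. A minimal sketch of a range check that accepts either the default marker or an explicit level and passes it through unchanged follows. CompressionLevels is a hypothetical helper, not the ZipWriter code.

```java
import java.util.zip.Deflater;

// Hypothetical helper, not part of bagit-java.
class CompressionLevels {
  // Accepts Deflater.DEFAULT_COMPRESSION (-1) or an explicit level from
  // Deflater.NO_COMPRESSION (0) to Deflater.BEST_COMPRESSION (9).
  static int validate(int level) {
    boolean isDefault = level == Deflater.DEFAULT_COMPRESSION;
    boolean isExplicit = level >= Deflater.NO_COMPRESSION && level <= Deflater.BEST_COMPRESSION;
    if (isDefault || isExplicit) {
      return level; // passed through as-is, with no sign change
    }
    throw new IllegalArgumentException(
        "Valid compression levels are -1 (default) or 0-9, got: " + level);
  }
}
```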

BagVerifier starts user threads that prevent client application from terminating

To reproduce this problem, compile and execute the following code BagVerifierTest.java:

import gov.loc.repository.bagit.domain.Bag;
import gov.loc.repository.bagit.reader.BagReader;
import gov.loc.repository.bagit.verify.BagVerifier;

import java.nio.file.Paths;

public class BagVerifierTest {
    public static void main(String[] args) throws  Exception {
        BagVerifier verifier = new BagVerifier();
        BagReader reader = new BagReader();
        Bag b = reader.read(Paths.get(args[0]));
        verifier.isValid(b, false);
    }
}
javac -cp bagit-5.0-SNAPSHOT.jar:. BagVerifierTest.java
java -cp bagit-5.0-SNAPSHOT.jar:slf4j-api-1.7.6.jar:. BagVerifierTest <some valid bag>

The problem is that the program doesn't terminate. This is probably due to the instantiation of a thread pool here. It uses the default thread factory which creates user (non-daemon) threads. These threads keep running unless you shut them down yourself, and in doing so prevent the application from closing down.

A solution could be to use a custom thread factory instead, that creates daemon threads, or to give the client application a handle to the ExecutorService so that it can call shutdown or shutdownNow
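
The first suggestion can be sketched as follows, using only java.util.concurrent. DaemonExecutors is a hypothetical class; whether the library should adopt daemon threads or expose its executor is a design decision this sketch does not settle.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical helper, not part of bagit-java.
class DaemonExecutors {
  // Wraps the default factory and marks every new thread as a daemon, so
  // idle pool threads do not keep the JVM alive after main() returns.
  static ThreadFactory daemonThreadFactory() {
    ThreadFactory defaults = Executors.defaultThreadFactory();
    return runnable -> {
      Thread thread = defaults.newThread(runnable);
      thread.setDaemon(true);
      return thread;
    };
  }

  static ExecutorService newCachedDaemonPool() {
    return Executors.newCachedThreadPool(daemonThreadFactory());
  }
}
```

Daemon threads are stopped abruptly at JVM exit, so this suits read-only work such as checksum verification better than tasks that write files.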

Enhancement: Controlling number of threads used to verify a bag

Given

  • A bag
  • And a requirement to control the number of threads used by Bagit

When

  • I ask to verify the bag

Then

  • I should be able to specify the number of threads used to do verification

This is getting into the weeds for use cases, but right now checkAllFilesListedInManifestExist() and checkHashes() in BagVerifier are using
final ExecutorService executor = Executors.newCachedThreadPool();

We need something like
final ExecutorService executor = Executors.newFixedThreadPool(nThreads);

We are more than happy to just implement the use case for ourselves in a subclass - -but doing that in a straightforward way would mean making checkAllFilesListedInManifestExist() and checkHashes() protected methods.

Would that be acceptable?
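
To make the request concrete, here is a minimal sketch of running a batch of tasks on a pool whose size the caller chooses. BoundedRunner and runAll are hypothetical names; this is not BagVerifier code, only an illustration of Executors.newFixedThreadPool with a caller-supplied thread count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of bagit-java.
class BoundedRunner {
  // Runs all tasks on at most nThreads threads and returns the results in
  // task order. The pool is always shut down, even when a task fails.
  static <T> List<T> runAll(List<Callable<T>> tasks, int nThreads)
      throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(nThreads);
    try {
      List<T> results = new ArrayList<>();
      for (Future<T> future : executor.invokeAll(tasks)) {
        results.add(future.get());
      }
      return results;
    } finally {
      executor.shutdown();
    }
  }
}
```

Each task could, for example, hash one payload file and report whether it matches the manifest. Shutting the pool down in the finally block also keeps non-daemon worker threads from holding the JVM open.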

more easily add files to bag payloads after creation

Given a bag already exists on the filesystem
When I add a file to that bag with a few lines of code
Then the bag manifest is updated/created with that new file and the applicable tag files are updated

Error using bagit-4.12.0

I just downloaded the release for bagit-4.12.0.zip for use of the command line tools on a RHEL 6 box. When I run a specific bagit command or just ./bagit from the 'bin' directory, I get this error:

Exception in thread "main" java.lang.UnsupportedClassVersionError: gov/loc/repository/bagit/driver/CommandLineBagDriver : Unsupported major.minor version 51.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: gov.loc.repository.bagit.driver.CommandLineBagDriver.  Program will exit.

I know that there are plans to remove the CLI in new versions (btw, I think this is a bad idea--many people and programs rely on the CLI) but I thought I would be safe using an old 4.x release.

The last line in the 'bagit' file is below. I'm guessing I don't have this class on my local machine?

exec "$JAVACMD" "${JVM_OPTS[@]}" -classpath "$CLASSPATH" gov.loc.repository.bagit.driver.CommandLineBagDriver "$@"

Any suggestions for solving? I need to make bags from the command line, using the multi-part bag splitting feature of the CLI. Thanks,
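
A note on reading this error: "major.minor version 51.0" is the class-file format version, and major version 51 corresponds to Java 7, so the message usually means the class was compiled for a newer Java than the JVM running it. The sketch below reads the version from a class file header and maps it to a Java release. ClassFileVersion is a hypothetical diagnostic helper, not part of the bagit distribution.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic helper, not part of bagit-java.
class ClassFileVersion {
  // A class file starts with the 4-byte magic 0xCAFEBABE, followed by a
  // 2-byte minor version and a 2-byte major version.
  static int majorVersion(InputStream classFile) throws IOException {
    DataInputStream in = new DataInputStream(classFile);
    if (in.readInt() != 0xCAFEBABE) {
      throw new IOException("Not a Java class file");
    }
    in.readUnsignedShort(); // minor version, not needed here
    return in.readUnsignedShort();
  }

  // Major 49 is Java 5, 50 is Java 6, 51 is Java 7, 52 is Java 8, and so on.
  static int javaRelease(int majorVersion) {
    return majorVersion - 44;
  }
}
```

Comparing that number with the output of java -version on the target machine shows whether a newer JRE is needed.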

Ignore "hidden" files

While the bagit spec states that all payload files MUST be listed in at least one manifest file, it can be understood that hidden files, i.e. files that are not user generated and that are not displayed or listed by default by the operating system should not be considered as payload files.

The concept of a hidden file was created to prevent system-generated utility files from interfering with user application flows. By definition, Bagit is a user-level application, as it is intended to package and verify user-generated content in the context of content archival.

While including those system-generated files (.DS_Store, thumbnails...) causes insurmountable difficulties to the user, it is not demonstrated that there is a single valid scenario where an archivist would WANT or NEED to include those files in a bag.

Following the reasoning above, only considering non-hidden files, or files that are listed by default by the current file system (e.g. files listed by 'ls' or DIR command, visible in Finder or Explorer...) in the packaging and verification processes, fully implements the specification, and no amendment to the specification is needed.

How to change MD5 checksumming to SHA1

Using version 4.11 I can create Bags, but these have MD5 checksumming.
I want to change it to SHA1, but I have no clue how to do that.
Can you give some example code?

HolePuncherImplTest fails in OracleJDK8 Early Access build

While working on fixing the javadoc errors that are preventing adding JDK8 to the TravisCI build, I have been getting the following test failure:

HolePuncherImplTest.testMakeHoleyWithSlash:81 expected:<data/[dir2/dir3/test5].txt> but was:<data/[test2].txt>

My Java versions while getting the failure are (Build-124):

$ java -version
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b124)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b66, mixed mode)
$ javac -version
javac 1.8.0-ea

On the surface it appears as though the ordering for FetchTxt is not preserved in JDK8, which is possible if there is a HashSet involved somewhere in the process, or the parsing of the FetchTxt file is not strictly linear for some other reason.
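
If a HashSet really is involved (the issue only suspects this), its iteration order is unspecified and can differ between JDK releases, whereas LinkedHashSet iterates in insertion order. A minimal sketch of de-duplicating entries while keeping file order follows; OrderedEntries is a hypothetical helper, not the FetchTxt implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not part of bagit-java.
class OrderedEntries {
  // Removes duplicates but keeps the first occurrence of each entry in its
  // original position, independent of hash codes or the JDK version.
  static List<String> dedupePreservingOrder(List<String> entries) {
    return new ArrayList<>(new LinkedHashSet<>(entries));
  }
}
```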

Issues running on larger files on Network-Attached Storage devices?

Hi, I'm wondering if there have been other reported issues with BagIt not running successfully on files that are stored on Network-Attached Storage and processed locally. The tool will work on some files but not others, and from my observation it seems to only work on smaller files (video files smaller than 1 GB are fine but anything larger fails to run).

Here is the full log report:

2016-02-11 12:50:13,377 [main] INFO CommandLineBagDriver : Performing operation: baginplace
2016-02-11 12:50:13,390 [main] INFO PreBagImpl : Making a bag in place at /Users/joeyheinen/Library/home-1/University Photograph Media/UPM_0463/Preservation Master
2016-02-11 12:50:13,423 [main] ERROR CommandLineBagDriver : Error: java.io.FileNotFoundException: Source '/Users/joeyheinen/Library/home-1/University Photograph Media/UPM_0463/Preservation Master/._.DS_Store' does not exist
java.lang.RuntimeException: java.io.FileNotFoundException: Source '/Users/joeyheinen/Library/home-1/University Photograph Media/UPM_0463/Preservation Master/._.DS_Store' does not exist
at gov.loc.repository.bagit.impl.PreBagImpl.makeBagInPlace(PreBagImpl.java:100)
at gov.loc.repository.bagit.driver.CommandLineBagDriver.performOperation(CommandLineBagDriver.java:852)
at gov.loc.repository.bagit.driver.CommandLineBagDriver.execute(CommandLineBagDriver.java:411)
at gov.loc.repository.bagit.driver.CommandLineBagDriver.main(CommandLineBagDriver.java:153)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.classworlds.Launcher.launchStandard(Launcher.java:353)
at org.codehaus.classworlds.Launcher.launch(Launcher.java:264)
at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
Caused by: java.io.FileNotFoundException: Source '/Users/joeyheinen/Library/home-1/University Photograph Media/UPM_0463/Preservation Master/._.DS_Store' does not exist
at org.apache.commons.io.FileUtils.moveToDirectory(FileUtils.java:2169)
at gov.loc.repository.bagit.impl.PreBagImpl.makeBagInPlace(PreBagImpl.java:81)
... 11 more
2016-02-11 12:50:13,424 [main] INFO CommandLineBagDriver : Returning 2

clean up code

Current:

  • code coverage: ~71-72%
  • findbugs report: 65
  • PMD report (default): 90
  • PMD report (adding java-basic and java-braces): 263

Proposed:

  • code coverage: coverage increased for non-trivial code paths
  • findbugs report: 0
  • PMD report: 0

Build failure - unable to delete file

mvn package is failing for me on Windows 7 with the following error:

-------------------------------------------------------------------------------
Test set: gov.loc.repository.bagit.writer.impl.ZipBagWriterTest
-------------------------------------------------------------------------------
Tests run: 4, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.062 sec <<< FAILURE! - in gov.loc.repository.bagit.writer.impl.ZipBagWriterTest
testOverwrite(gov.loc.repository.bagit.writer.impl.ZipBagWriterTest)  Time elapsed: 0.031 sec  <<< ERROR!
java.lang.RuntimeException: Error deleting C:\Users\MoMA\Documents\GitHub\bagit-java\target\unit-test-data\bags\foo.zip: Unable to delete file: C:\Users\MoMA\Documents\GitHub\bagit-java\target\unit-test-data\bags\foo.zip
    at gov.loc.repository.bagit.utilities.TempFileHelper.switchTemp(TempFileHelper.java:34)
    at gov.loc.repository.bagit.writer.impl.AbstractWriter.switchTemp(AbstractWriter.java:47)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.endBag(ZipWriter.java:91)
    at gov.loc.repository.bagit.impl.AbstractBag.accept(AbstractBag.java:475)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.write(ZipWriter.java:158)
    at gov.loc.repository.bagit.writer.impl.AbstractWriterTest.testOverwrite(AbstractWriterTest.java:97)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Caused by: java.io.IOException: Unable to delete file: C:\Users\MoMA\Documents\GitHub\bagit-java\target\unit-test-data\bags\foo.zip
    at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
    at gov.loc.repository.bagit.utilities.TempFileHelper.switchTemp(TempFileHelper.java:30)
    at gov.loc.repository.bagit.writer.impl.AbstractWriter.switchTemp(AbstractWriter.java:47)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.endBag(ZipWriter.java:91)
    at gov.loc.repository.bagit.impl.AbstractBag.accept(AbstractBag.java:475)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.write(ZipWriter.java:158)
    at gov.loc.repository.bagit.writer.impl.AbstractWriterTest.testOverwrite(AbstractWriterTest.java:97)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

testWriter(gov.loc.repository.bagit.writer.impl.ZipBagWriterTest)  Time elapsed: 0.015 sec  <<< ERROR!
java.lang.RuntimeException: Error deleting C:\Users\MoMA\Documents\GitHub\bagit-java\target\unit-test-data\bags\foo.zip: Unable to delete file: C:\Users\MoMA\Documents\GitHub\bagit-java\target\unit-test-data\bags\foo.zip
    at gov.loc.repository.bagit.utilities.TempFileHelper.switchTemp(TempFileHelper.java:34)
    at gov.loc.repository.bagit.writer.impl.AbstractWriter.switchTemp(AbstractWriter.java:47)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.endBag(ZipWriter.java:91)
    at gov.loc.repository.bagit.impl.AbstractBag.accept(AbstractBag.java:475)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.write(ZipWriter.java:158)
    at gov.loc.repository.bagit.writer.impl.AbstractWriterTest.testWriter(AbstractWriterTest.java:43)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Caused by: java.io.IOException: Unable to delete file: C:\Users\MoMA\Documents\GitHub\bagit-java\target\unit-test-data\bags\foo.zip
    at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2279)
    at gov.loc.repository.bagit.utilities.TempFileHelper.switchTemp(TempFileHelper.java:30)
    at gov.loc.repository.bagit.writer.impl.AbstractWriter.switchTemp(AbstractWriter.java:47)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.endBag(ZipWriter.java:91)
    at gov.loc.repository.bagit.impl.AbstractBag.accept(AbstractBag.java:475)
    at gov.loc.repository.bagit.writer.impl.ZipWriter.write(ZipWriter.java:158)
    at gov.loc.repository.bagit.writer.impl.AbstractWriterTest.testWriter(AbstractWriterTest.java:43)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
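
One common reason a delete fails on Windows is that a stream on the file is still open. Whether that is the cause here is an assumption; the trace only shows that forceDelete failed. The sketch below is independent of bagit-java (the class and method names are hypothetical) and shows the pattern of closing every handle with try-with-resources before deleting.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ZipHandleSketch {
    // Writes a one-entry zip, lets try-with-resources close the stream,
    // then deletes the file. Returns true if the file is gone afterwards.
    static boolean writeThenDelete() {
        try {
            Path zip = Files.createTempFile("bag", ".zip");
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
                out.putNextEntry(new ZipEntry("bag/bagit.txt"));
                out.write("BagIt-Version: 0.97\n".getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            } // the stream is closed here, before the delete is attempted
            Files.delete(zip);
            return !Files.exists(zip);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("deleted: " + writeThenDelete());
    }
}
```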

Fetching of non-payload files should not be attempted

Scenario:

  1. Create a bag
  2. Add a fetch line to the fetch.txt that has a non-payload destination, e.g., metadata/file.txt
  3. Execute bag fillholey on the bag

Problem:

The tool reports a failure, but it fetches the file anyway. According to the BagIt specification, only payload files can be fetched. It seems the tool fetches the file and then tries to verify its checksum but, because it assumes it is dealing only with payload files, it checks only the payload manifests and then fails on the missing checksum (even if the checksum is present in the tagmanifest).

Code example

I created a JUnit example to demonstrate the problem. mvn test will now of course fail. See the test code for what I would expect to happen.

I looked up how the command line driver implements fillholey. Depending on the parameters you pass to the fetch-function you get slightly different results.

Correct behavior?

So, what would be the correct behavior? I suppose the library could either ignore fetch lines aimed at non-payload files or fail on them. At any rate, I find the current results a bit confusing.
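
Either option needs a check that decides whether a fetch destination is a payload path. A minimal sketch of such a check follows, assuming the payload directory is named data as in the examples above; the class and method names are hypothetical and not part of the bagit-java API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class FetchTargetCheck {
    // Returns true only if the destination is a relative path that, after
    // normalization, still points to a file inside the data/ directory.
    static boolean isPayloadPath(String destination) {
        Path normalized = Paths.get(destination).normalize();
        return normalized.getRoot() == null
            && normalized.getNameCount() > 1
            && normalized.getName(0).toString().equals("data");
    }

    public static void main(String[] args) {
        System.out.println(isPayloadPath("data/dir/file.txt"));
        System.out.println(isPayloadPath("metadata/file.txt"));
    }
}
```

A caller could skip or reject any fetch line for which the check returns false before any download starts, so that no file is written for an entry that is going to fail anyway.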

Proposed unit tests

  • Run verifypayloadmanifests on a bag with valid, invalid, and no payload manifests. It should return true, false, and an error (or false), respectively.
  • Create a zipped bag with an invalid bagit.txt file and run verifycomplete. Then try to delete the zip file. There should be no console output and no exception thrown. Repeat the same thing with a valid zipped bag.
  • Create a folder with empty directories, some that are truly empty and some that contain .keep files. Run bag in place. The truly empty directories should be ignored (they will not be present in the resulting bag) and the ones containing .keep files should be part of the bag. No exceptions should be thrown.

Changing example: BagCreator.bagInPlace is static

The README says in "Create a bag from a folder using version 0.97":

BagCreator creator = new BagCreator();
Bag bag = creator.bagInPlace(folder, algorithm, includeHiddenFiles);

-> but BagCreator.bagInPlace seems to be static, so:

Bag bag = BagCreator.bagInPlace(folder, algorithm, includeHiddenFiles);

SHA-1 manifest file format does not work with shasum

When submitting an issue please include:

Running 5.0.0-BETA
OS X 10.11.5

Please format it in the given when then style

For example (from link above):

Given

  • I have a manifest-SHA1.txt file of N checksums

When

  • I want to verify the manifest file using shasum directly

Then

  • I should be able to without re-formatting the lines

We are using this library to develop an import/export feature using BagIt bags. However, in our test implementation the generated manifest-SHA1.txt cannot be parsed by OS X's shasum command.

For example, the file appears as follows:

199DD5F71C1610D99D0E163C9E665FC99ECED911 data/rest/Department3.University0.edu/AssociateProfessor6/Publication10.jsonld
53EBFA7C03515FA0ABC674F8C6CEDE9619CE2E26 data/rest/Department3.University0.edu/GraduateCourse13.jsonld
E99A10E72C10839C5B4CB9D7772575450481E8AB data/rest/Department3.University0.edu/AssociateProfessor7/Publication14.jsonld
62CCBE962497D11A2A333793AB0EADDB918A6EF1 data/rest/Department3.University0.edu/FullProfessor3/Publication16/Proxy.jsonld

However

> shasum -a1 -c manifest-SHA1.txt
shasum: manifest-SHA1.txt: no properly formatted SHA1 checksum lines found

I can't find a standard for the format of a SHA-1 checksum file, but it seems there is supposed to be an asterisk before the filename. The BagIt spec makes the asterisk optional, but if I add either an additional space or an asterisk before the file path, the lines are read perfectly.

That is, this line fails to parse:

199DD5F71C1610D99D0E163C9E665FC99ECED911 data/rest/Department3.University0.edu/AssociateProfessor6/Publication10.jsonld

But this

199DD5F71C1610D99D0E163C9E665FC99ECED911 *data/rest/Department3.University0.edu/AssociateProfessor6/Publication10.jsonld

or this

199DD5F71C1610D99D0E163C9E665FC99ECED911  data/rest/Department3.University0.edu/AssociateProfessor6/Publication10.jsonld

is read as a validly formatted SHA-1 checksum line.
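
For illustration, here is a minimal sketch of writing a manifest line in the two-space layout that the report found shasum accepts. It uses only the JDK; the class and method names are hypothetical and this is not how bagit-java builds its manifests.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ManifestLineSketch {
    // Computes the SHA-1 of the given bytes as a lowercase hex string.
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Joins checksum and path with two spaces, one of the two layouts
    // that parsed successfully in the report above.
    static String manifestLine(String checksumHex, String path) {
        return checksumHex + "  " + path;
    }

    public static void main(String[] args) {
        byte[] content = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(manifestLine(sha1Hex(content), "data/example.txt"));
    }
}
```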

normalize filename paths

as per https://github.com/loc-rdc/bagitspec/blob/bagit1.0/bagit.xml#L939-L957

  • Implementations SHOULD discourage the creation of bags containing files which differ only in case.
  • Implementations MUST prevent the creation of bags containing files which differ only in normalization form.
  • BagIt implementations SHOULD tolerate differences in normalization form by comparing both the list of filesystem and manifest names after applying the same normalization form to both.
  • Implementations SHOULD issue a warning when multiple manifests are present which differ only in case or normalization form.

These requirements still need to be implemented.
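
The comparison the spec asks for can be sketched with the JDK's java.text.Normalizer. The example below assumes NFC as the common form (the choice of form is an assumption, as are the class and method names) and reports names that are different strings but become equal once normalized.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PathNormalizationSketch {
    // Applies the same normalization form to every name before comparing.
    static String nfc(String path) {
        return Normalizer.normalize(path, Normalizer.Form.NFC);
    }

    // Returns a description of each pair of names that differ only in
    // normalization form.
    static List<String> normalizationCollisions(List<String> paths) {
        Map<String, String> seen = new HashMap<>();
        List<String> collisions = new ArrayList<>();
        for (String path : paths) {
            String previous = seen.putIfAbsent(nfc(path), path);
            if (previous != null && !previous.equals(path)) {
                collisions.add(previous + " <-> " + path);
            }
        }
        return collisions;
    }
}
```

The same nfc helper could be applied to both the filesystem listing and the manifest entries before they are compared, which is the tolerance the third bullet describes.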

Make hashing more efficient

Currently the bagit-java library re-reads a file for each hash type. This is inefficient and unnecessary, since the hashes could be calculated in parallel.
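
One way to avoid the repeated reads is to feed every digest from the same buffer during a single pass over the file. The sketch below does exactly that with the JDK's MessageDigest; it is single-threaded rather than truly parallel, and the class and method names are hypothetical rather than part of bagit-java.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SinglePassHasher {
    // Reads the stream once and updates one MessageDigest per algorithm
    // from each chunk. Returns algorithm name -> lowercase hex digest.
    static Map<String, String> hashAll(InputStream in, List<String> algorithms) {
        Map<String, MessageDigest> digests = new LinkedHashMap<>();
        for (String algorithm : algorithms) {
            try {
                digests.put(algorithm, MessageDigest.getInstance(algorithm));
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
            }
        }
        try {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (MessageDigest digest : digests.values()) {
                    digest.update(buffer, 0, read);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, MessageDigest> entry : digests.entrySet()) {
            result.put(entry.getKey(), toHex(entry.getValue().digest()));
        }
        return result;
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

A caller would open the file once, for example with Files.newInputStream, and pass a list such as "MD5" and "SHA-1". I/O usually dominates for large files, so a single pass already removes most of the cost before any threading is considered.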

Release version 4.10.0

We need the upgraded commons compress library for the next CTS release 3.5. Please create an official build of 4.10 and deploy it to our maven repository so CTS can use it as a dependency in 3.5. Thanks.

bagit style warnings

The bagit specification is very flexible, leading to some cases where a bag is technically valid but the practice should be discouraged. In those cases it would be nice to let the user know that they shouldn't really be doing that.

Therefore I propose that the bagit-java library return warnings whenever possible for these types of cases. For example, when doing BagValidator.validate(bag) instead of returning void it would be a lot nicer to return a list of warnings.
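
As a rough sketch of the shape such an API could take, the example below returns a list of warnings instead of void. Everything in it is hypothetical: the class, the method, and the single check (payload paths that differ only in case) are illustrations, not the bagit-java API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class WarningSketch {
    // Returns one human-readable warning per pair of payload paths that
    // differ only in case. An empty list means nothing to report.
    static List<String> collectWarnings(List<String> payloadPaths) {
        Map<String, String> seen = new HashMap<>();
        List<String> warnings = new ArrayList<>();
        for (String path : payloadPaths) {
            String previous = seen.putIfAbsent(path.toLowerCase(Locale.ROOT), path);
            if (previous != null && !previous.equals(path)) {
                warnings.add("Paths differ only in case: " + previous + " and " + path);
            }
        }
        return warnings;
    }
}
```

Returning a list keeps validation failures (exceptions) separate from discouraged-but-valid findings, and lets callers decide whether to log, display, or ignore them.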
