islandora_compound_batch's People

Contributors

egesu, garrettarm, marcusbarnes, mjordan

islandora_compound_batch's Issues

Create tagged releases

I was just watching a demo where islandora_compound_batch was used as part of a CircleCI script. It would be good to start creating tagged releases that users can pin in their DevOps scripts. This would avoid situations where merged pull requests introduce changes that break these scripts.

Provide a way to override the extension-to-content-model mapping

This feature is listed in the README's 'todo'. I propose adding a drush option --content_models (for consistency with Islandora Batch) that would take as its value a semicolon-separated list of extension => content model pairs, with a double colon (::) as the delimiter within pairs, e.g.,

--content_models=pdf::islandora:foo
--content_models=pdf::islandora:foo;jpg::islandora:barCModel

If the --content_models parameter is present, its mappings would have precedence over the mappings defined in $this->extensionToContentModelMap.

If this sounds like a viable plan, I'd be happy to take a stab at it.
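The proposed option format can be sketched as follows. This is a minimal illustration in Python, not the module's PHP code: the function names and the default mappings are hypothetical stand-ins for $this->extensionToContentModelMap, and lowercasing the extension is an assumption.

```python
def parse_content_models(option_value):
    """Parse 'ext::cmodel;ext::cmodel' into a dict of extension -> content model."""
    overrides = {}
    for pair in option_value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        # Split on the first '::' only; content models contain a single ':'.
        extension, sep, content_model = pair.partition("::")
        if not sep or not extension or not content_model:
            raise ValueError(f"Malformed mapping: {pair!r}")
        overrides[extension.lower()] = content_model  # assumption: case-insensitive
    return overrides


# Hypothetical defaults standing in for $this->extensionToContentModelMap.
DEFAULT_MAP = {"pdf": "islandora:sp_pdf", "jpg": "islandora:sp_basic_image"}


def effective_map(option_value=None):
    """Mappings from the option take precedence over the defaults."""
    merged = dict(DEFAULT_MAP)
    if option_value:
        merged.update(parse_content_models(option_value))
    return merged
```

Calling effective_map("pdf::islandora:foo") replaces only the pdf entry and leaves the other defaults untouched, which matches the precedence rule described above.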

Allow ingesting of child objects that only have metadata

Coming out of #14, we should support the creation of child objects that only have a MODS.xml or MARC.xml file and no payload file (PDF, TIFF, etc.). Currently, islandora_compound_batch determines the child's content model based on the extension of its payload file; additionally, if the payload file is absent, the child is not created. We will need to provide a drush option to let users indicate which content model to assign to metadata-only child objects.
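The fallback described above could look roughly like this. It is a Python sketch of the decision logic only: the function name and the metadata_only_model parameter are hypothetical (the issue does not name the drush option), and it assumes the payload file is called OBJ.<extension>, as in the directory listings elsewhere on this page.

```python
import os


def resolve_child_content_model(filenames, extension_map, metadata_only_model=None):
    """Return the content model for a child directory, or None to skip the child."""
    for filename in filenames:
        stem, extension = os.path.splitext(filename)
        if stem != "OBJ":
            continue  # assumption: only OBJ.* counts as the payload file
        extension = extension.lstrip(".").lower()
        if extension in extension_map:
            return extension_map[extension]
    # No usable payload file: fall back to the user-supplied model, if any.
    return metadata_only_model
```

With no fallback supplied, a metadata-only directory still yields None, which corresponds to the current behaviour of not creating the child.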

Tagging @egesu to make sure this issue represents the intended use case.

Consider natural language sort order in create_structure_files

I'm building some compounds and trying this out again. I noticed that because I use a numbering scheme to generate my subdirectories, they fall out of order if there are more than 9 items, e.g.:

<islandora_compound_object title="pa_aar">
  <child content="pa_aar/1"/>
  <child content="pa_aar/10"/>
  <child content="pa_aar/2"/>
  <child content="pa_aar/3"/>
  <child content="pa_aar/4"/>
  <child content="pa_aar/5"/>
  <child content="pa_aar/6"/>
  <child content="pa_aar/7"/>
  <child content="pa_aar/8"/>
  <child content="pa_aar/9"/>
</islandora_compound_object>

Adding a natural language sort (or perhaps a configurable sort option) would help.

I just added this line

sort($stuffindirectory, SORT_NATURAL);

below here which results in my desired ordering.

<islandora_compound_object title="pa_aar">
  <child content="pa_aar/1"/>
  <child content="pa_aar/2"/>
  <child content="pa_aar/3"/>
  <child content="pa_aar/4"/>
  <child content="pa_aar/5"/>
  <child content="pa_aar/6"/>
  <child content="pa_aar/7"/>
  <child content="pa_aar/8"/>
  <child content="pa_aar/9"/>
  <child content="pa_aar/10"/>
</islandora_compound_object>
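For readers who want to see why the plain sort misorders these names, here is a small Python sketch of a natural sort key. It is an illustration of the idea only; the module itself is PHP and the fix above uses PHP's SORT_NATURAL flag. The name natural_key is hypothetical.

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so '10' sorts after '9'."""
    chunks = re.split(r"(\d+)", name)
    # re.split with a capturing group alternates text (even index) and digits (odd index).
    return [int(chunk) if index % 2 else chunk.lower()
            for index, chunk in enumerate(chunks)]


children = ["pa_aar/1", "pa_aar/10", "pa_aar/2", "pa_aar/9"]
plain_order = sorted(children)                     # character-by-character comparison
natural_order = sorted(children, key=natural_key)  # numeric chunks compared as integers
```

Here plain_order keeps pa_aar/10 directly after pa_aar/1, while natural_order places it last.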

Allow assignment of content model for parent objects

If you uncheck the option "Only allow compound objects to have child objects associated with them" at admin/islandora/solution_pack_config/compound_object (Administration > Islandora > Solution Pack Configuration > Compound Object Solution Pack), any object, regardless of its content model, can have children. Currently, the content model of the parent objects created by this module is hard coded at https://github.com/MarcusBarnes/islandora_compound_batch/blob/master/includes/object.inc#L175. It would be useful to allow the user to pass in a --parent_content_model parameter to override this.

Add the ability to create OCR on ingest?

Other batch modules such as Islandora book batch have an option to produce OCR on appropriate datastreams (files) on batch ingest via islandora_ocr. Based on how you anticipate using Islandora Compound Batch in the near future, would this be a useful feature to add to the to-do? Please comment or give your reaction in order to vote and chime-in. Thank you.

Updates to README

The note "Note: --target applies to drush 6 and below, while --scan_target replaces this keyword in drush 7 and above." in the examples in the "OBJ extension to content model mappings" section is out of place and somewhat confusing. Also, we should indicate that the relationship assigned to children is the one that is configured in the Compound SP's "Child relationship predicate" (added in #29).

GUI hangs on line 75 of batch.form.inc

Fatal error: Class 'IslandoraCompoundBatch' not found in /opt/mounts/drupal/ldl/sites/all/modules/islandora_compound_batch/includes/batch.form.inc on line 75

$preprocessor = new IslandoraCompoundBatch($connection, $parameters);

The class IslandoraCompoundBatch is not defined. In the previous commit, this line read:
$preprocessor = new IslandoraNewspaperBatch($connection, $parameters);
which referenced islandora_compound_batch.inc:
class IslandoraNewspaperBatch extends IslandoraScanBatch .....

The file islandora_compound_batch.inc was removed in later commits. Along with it, the class IslandoraNewspaperBatch (or IslandoraCompoundBatch) was eliminated.

The GUI won't work until there is an IslandoraCompoundBatch class defined.

Any advice on your reasoning before I try to craft my own IslandoraCompoundBatch class?

"Defer derivative generation during ingest" incompatible with islandora_compound_batch?

Use case:
I'd like to do all of my derivative generation before I submit my batch to Islandora so that I can accelerate ingest rates.

Problem:
When I check "Defer derivative generation during ingest" on /admin/islandora/configure and create a batch set using islandora_compound_batch, the resulting objects, when ingested, are empty apart from MODS, DC, and RELS-EXT. They don't even contain the TIF OBJ that was submitted!

If, however, I uncheck "Defer derivative generation during ingest" on /admin/islandora/configure, the resulting objects, when ingested, contain all of the appropriate datastreams, including the OBJ. My sense, though, is that those datastreams were generated by Islandora and that the versions I pregenerated were not actually used.

For both example cases above islandora_compound_batch was pointing at a directory full of object folders containing the appropriate datastreams with respective file names. Example:

./smith_ssc_324_digital_object_323
./smith_ssc_324_digital_object_323/structure.xml
./smith_ssc_324_digital_object_323/MODS.xml
./smith_ssc_324_digital_object_323/OCR.txt
./smith_ssc_324_digital_object_323/TN.jpg
./smith_ssc_324_digital_object_323/00001
./smith_ssc_324_digital_object_323/00001/JPG.jpg
./smith_ssc_324_digital_object_323/00001/JP2.jp2
./smith_ssc_324_digital_object_323/00001/MODS.xml
./smith_ssc_324_digital_object_323/00001/TN.jpg
./smith_ssc_324_digital_object_323/00001/OBJ.tif
./smith_ssc_324_digital_object_323/00002
./smith_ssc_324_digital_object_323/00002/JPG.jpg
./smith_ssc_324_digital_object_323/00002/JP2.jp2
./smith_ssc_324_digital_object_323/00002/MODS.xml
./smith_ssc_324_digital_object_323/00002/TN.jpg
./smith_ssc_324_digital_object_323/00002/OBJ.tif
./smith_ssc_324_digital_object_323/00003
./smith_ssc_324_digital_object_323/00003/JPG.jpg
./smith_ssc_324_digital_object_323/00003/JP2.jp2
./smith_ssc_324_digital_object_323/00003/MODS.xml
./smith_ssc_324_digital_object_323/00003/TN.jpg
./smith_ssc_324_digital_object_323/00003/OBJ.tif
...
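To make the naming convention in this listing explicit, the following Python sketch groups the listed paths by object directory and reads each file's stem as the intended datastream ID. That reading of the file names is an assumption taken from the listing, the function name is hypothetical, and none of this is code from the module.

```python
from pathlib import PurePosixPath


def datastreams_by_directory(paths):
    """Group file paths by parent directory, keyed by DSID (the file's stem)."""
    grouped = {}
    for raw in paths:
        path = PurePosixPath(raw)
        if not path.suffix:
            continue  # entries without an extension are directories in the listing
        if path.name == "structure.xml":
            continue  # structure file, not a datastream
        grouped.setdefault(str(path.parent), {})[path.stem] = path.name
    return grouped
```

Comparing such a mapping against the datastreams present on each ingested object would show which supplied files were dropped.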

Here are my exact commands:

drush -v --user=compass_admin islandora_compound_batch_preprocess --scan_target=/mnt/ingest/smith/compound-large-image-sample --namespace=test --parent=smith:test
drush -v --user=1 islandora_batch_ingest --ingest_set=774

I can send you a sample ingest directory if needed.

Possible desired outcomes:

  1. When I check "Defer derivative generation during ingest", take the supplied files and insert them into their respective datastreams (as happens for book objects), especially the OBJ.
  2. Add a configuration option for islandora_compound_batch, like the one in the book and newspaper solution packs, that allows disabling derivative generation for just the compound object type.

Assumptions:
I've only tried this with large image objects using TIF files as the OBJ. I'm assuming that this is an issue for other child object types.

Update Drush option names to work with latest drush versions

I found that I was unable to run drush icbp using later versions of drush; I'm on a pretty recent version, 9.0-dev.

Anyway, it turned out that using an option name other than target to indicate the source directory made my errors go away. Unfortunately, I made this fix long ago on our fork (which is somehow no longer linked), so I don't have error output or anything to help explain why I've made this change.

Maybe I'll just leave this here in case it can be useful: lsulibraries@422e2e0

Use value of Child relationship predicate setting rather than the default.

We have been assuming that the Child relationship predicate is the default 'isConstituentOf' in the addRelationshipsForChild method:

public function addRelationshipsForChild($parent_pid, $sequence_number) {

However, this value is configurable in the 'Child relationship predicate' setting on the compound object solution pack admin form (admin/islandora/solution_pack_config/compound_object). Rather than assuming that it will be the default 'isConstituentOf', use the value of the 'Child relationship predicate' setting (with the same caveats about changing that setting in the compound object solution pack admin form after the fact).

Thanks to @bseeger for spotting this (see #27 (comment)).

Children not being ingested

I'm using Compound Batch for the first time since the resolution of #2. Even though I've pulled in the latest code (I'm running at fde7203) and run drush devel-reinstall islandora_compound_batch so that the db gets updated, none of the children in my batch are being ingested. Only the parents are.

Here's the structure of the islandora_compound_batch table:

mysql> describe islandora_compound_batch;
+---------------------+---------------------+------+-----+---------+----------------+
| Field               | Type                | Null | Key | Default | Extra          |
+---------------------+---------------------+------+-----+---------+----------------+
| id                  | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| child_content_value | text                | NO   |     | NULL    |                |
| object_id           | bigint(20) unsigned | NO   |     | 0       |                |
| object_xpath        | text                | NO   |     | NULL    |                |
| parent_xpath        | text                | NO   |     | NULL    |                |
| object_pid          | bigint(20) unsigned | NO   |     | 0       |                |
| parent_pid          | varchar(255)        | NO   |     |         |                |
| batch_id            | bigint(20) unsigned | NO   |     | 0       |                |
+---------------------+---------------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)

Here's one of the compound object subdirectories in the input directory:

999/
├── 991
│   ├── MODS.xml
│   └── OBJ.jpg
├── 992
│   ├── MODS.xml
│   └── OBJ.jpg
├── 993
│   ├── MODS.xml
│   └── OBJ.jpg
├── 994
│   ├── MODS.xml
│   └── OBJ.jpg
├── 995
│   ├── MODS.xml
│   └── OBJ.jpg
├── 996
│   ├── MODS.xml
│   └── OBJ.jpg
├── 997
│   ├── MODS.xml
│   └── OBJ.jpg
├── 998
│   ├── MODS.xml
│   └── OBJ.jpg
├── MODS.xml
└── structure.xml

8 directories, 18 files

I've run create_structure_files.php over my input directory. Here's the structure.xml file for the input directory above:

<?xml version="1.0" encoding="utf-8"?>
<!--Islandora compound structure file used by the Compound Batch module. On batch ingest,
    'islandora_compound_object' elements become compound objects, and 'child' elements become their
    children. Files in directories named in child elements' 'content' attribute will be added as their
    datastreams. If 'islandora_compound_object' elements do not contain a MODS.xml file, the value of
    the 'title' attribute will be used as the parent's title/label.-->
<islandora_compound_object title="999">
  <child content="999/991"/>
  <child content="999/992"/>
  <child content="999/993"/>
  <child content="999/994"/>
  <child content="999/995"/>
  <child content="999/996"/>
  <child content="999/997"/>
  <child content="999/998"/>
</islandora_compound_object>
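When debugging a batch like this, it can help to confirm what the structure file actually declares. The Python sketch below lists the child content paths and reports any element names other than islandora_compound_object and child. It is a hypothetical standalone check (check_structure is not part of the module) and assumes the file follows the layout shown above.

```python
import xml.etree.ElementTree as ET

EXPECTED_TAGS = ("islandora_compound_object", "child")


def check_structure(xml_text):
    """Return (child content paths, sorted list of unexpected element names)."""
    root = ET.fromstring(xml_text)
    # Every 'child' element should name its directory in the 'content' attribute.
    children = [element.get("content") for element in root.iter("child")]
    unexpected = sorted({element.tag for element in root.iter()
                         if element.tag not in EXPECTED_TAGS})
    return children, unexpected
```

An empty list of children, or a non-empty list of unexpected element names, suggests the structure file rather than the ingest step is the place to look first.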

Anyone have any suggestions as to what I'm doing wrong? Won't somebody please think of the children? (Sorry, I couldn't resist that 😆)

Configurable CModel mapping?

It would be difficult to make this module 100% useful for all content models. For example, in this branch I added a mapping for 'txt' => 'islandora:sp_remoteMediaCModel', for my custom Remote Media content model. But it is conceivable that someone else might use txt for a different CModel.

I suggest adding an admin form where these mappings can be configured by the user and customized as needed, so that more file types and CModels can be added without touching the code.

(Or if you're OK with my Remote Media update, I could just make a PR for it.)

Thumbnails ignored

My data directory has the following structure:

export/
└── S01E01
    ├── MODS.xml
    ├── TN.jpeg
    ├── episode
    │   ├── MODS.xml
    │   └── OBJ.mp3
    ├── structure.xml
    └── transcript
        ├── MODS.xml
        └── OBJ.pdf

I've provided a thumbnail in the S01E01 directory, but it is being completely ignored by the import process. I can add the thumbnail manually and things work as expected.

I'm using a stock Islandora 7.x-1.13 VM. The only additional module I've installed is islandora_compound_batch.

Something I am doing is wrong

@MarcusBarnes I was trying this out and I'm not sure what I am doing wrong, but I ended up with 200 objects instead of a compound with 200 children.

I have run it twice, using the default --parent_relationship_pred and then setting it to --parent_relationship_pred=isConstituentOf. Same result after both ingests.

I have a directory structure like

/vagrant/test_compounds
     /compound_1
           /1
               OBJ.jpg
           /2
               OBJ.jpg
          ....
          /200
               OBJ.jpg
          MODS.xml
          structure.xml

I created the structure.xml by running
php create_structure_files.php /vagrant/test_compounds/

It looks like this

<?xml version="1.0" encoding="utf-8"?>
<!--Islandora compound structure file used by the Compound Batch module. On batch ingest,
    'islandora_compound_object' elements become compound objects, and 'child' elements become their
    children. Files in directories named in child elements' 'content' attribute will be added as their
    datastreams. If 'islandora_compound_object' elements do not contain a MODS.xml file, the value of
    the 'title' attribute will be used as the parent's title/label.-->
<islandora_compound_object title="compound_1">
  <parent title="compound_1/1"/>
  <parent title="compound_1/10"/>
  <parent title="compound_1/100"/>
  <parent title="compound_1/101"/>
  <parent title="compound_1/102"/>
  <parent title="compound_1/103"/>
 ....

Then I ran

drush -u 1 islandora_compound_batch_preprocess --namespace=islandora --parent='islandora:compound_collection' --target=/vagrant/test_compounds

and drush -u 1 ibi --ingest_set=<set id>

Then I tried

drush -u 1 islandora_compound_batch_preprocess --namespace=islandora --parent='islandora:compound_collection' --parent_relationship_pred=isConstituentOf --target=/vagrant/test_compounds

Same result, when I go into islandora:compound_collection there are two objects named MODS and 400 named OBJ. They are all compound objects.

This is obviously not the expected behaviour. What did I mess up?

Does islandora_compound_batch work as expected with Islandora 7.x-1.10?

Hi,

I'm using Islandora 7.x-1.10 and trying out this module. After ingest, I see the two parent level compound objects, but when I click on them, all I see is the MODS metadata - the image objects have not been associated with the compound objects. The images are not part of any collection, though they look like they were created correctly.

Have folks tried this module with v1.10?
There's a very good chance it's my lack of knowledge about how this works, but since I saw a github issue about whether it worked in v1.9 I thought I'd ask.

Thanks,
Bethany

Change default branch to 7.x?

By convention, Islandora modules use 7.x as their default branch. Any reason Islandora Compound Batch should use master?

Sample Batch Set of Compound Object Not Being Ingested properly

@CadenArmstrong has provided me with a sample batch set of compound objects that are not being ingested as expected. The sample adheres to the documentation in the README. The drush commands appear to work as expected, but when you visit the --parent object, no items appear. Viewing the object PIDs, you see all the items in the batch, rather than separate compound objects.

Possible source of problem to investigate:

  • Changes to the dependencies islandora_batch or islandora_compound_object (and their dependencies).
  • Folder and file names of the sample batch - try simplifying to eliminate folder or file name issue.

Double check Islandora versions where islandora_compound_batch is being used without issue against versions of modules in islandora_vagrant. Use test compound objects in islandora_vagrant to see if the issue is now appearing for batch sets that Marcus knows used to function as expected.
