
bhl_aspace_archive_it's Introduction

BHL ArchivesSpace/Archive-It Integration

An ArchivesSpace plugin for importing Archive-It seeds. Each imported seed generates an archival object that is associated with the resource corresponding to the seed's Archive-It collection. A digital object instance with a File URI pointing to that seed URL's Wayback Machine calendar page is also created and associated with the archival object. The archival object can then be associated with a parent component that represents the website as a distinct intellectual entity, separate from the website's URL at a given point in time. The plugin also adds support for exporting website-level archival objects as MARC XML.

Directory Structure

backend\
    controllers\
        archive_it.rb
    job_runners\
        archive_it_collection_importer.rb
    model\
        lib\
            archive_it_export.rb
            archive_it_marc_model.rb
            archive_it_marc_serializer.rb
        archive_it.rb
    plugin_init.rb
frontend\
    assets\
        add_archive_it_link.js
    controllers\
        archive_it_controller.rb
        archive_it_collection_map_controller.rb
    locales\
        en.yml
    models\
        archive_it.rb
    views\
        archive_it\
            index.html.erb
        archive_it_collection_map\
            index.html.erb
        archive_it_import_job\
            _form.html.erb
            _show.html.erb
        layout_head.html.erb
    plugin_init.rb
    routes.rb
schemas\
    archive_it_import_job.rb
config.yml

Data Model

This plugin assumes a data model in which each Archive-It collection corresponds to an ArchivesSpace resource, each website corresponds to an archival object associated as a direct child-level component of a resource, and each Archive-It seed corresponds to an archival object associated as a direct child-level component of a website-level archival object. The site-level archival object contains descriptive metadata about the website (creator, subjects, abstract) and the seed-level objects contain capture dates and a digital object instance linking to the URL in the Wayback Machine.

Resource
    Website 1
        Seed 1
          Digital Object Instance
        Seed 2
          Digital Object Instance
    Website 2
        Seed 3
          Digital Object Instance
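
The hierarchy above can be sketched as plain Ruby data. This is only an illustration: the Struct names, the EXAMPLE_RESOURCE constant, and the example.org URIs are hypothetical and do not correspond to the plugin's classes or to ArchivesSpace's own record types.

```ruby
# Illustrative only: hypothetical structs mirroring the outline above.
DigitalObjectInstance = Struct.new(:file_uri)
Seed     = Struct.new(:title, :digital_object)
Website  = Struct.new(:title, :seeds)
Resource = Struct.new(:title, :websites)

# Placeholder data; the URIs are made up for the example.
EXAMPLE_RESOURCE = Resource.new(
  'Resource',
  [
    Website.new('Website 1', [
      Seed.new('Seed 1', DigitalObjectInstance.new('https://example.org/wayback/seed-1')),
      Seed.new('Seed 2', DigitalObjectInstance.new('https://example.org/wayback/seed-2'))
    ]),
    Website.new('Website 2', [
      Seed.new('Seed 3', DigitalObjectInstance.new('https://example.org/wayback/seed-3'))
    ])
  ]
)

# Walk the tree and return one line per node, indented like the outline.
def outline(resource)
  lines = [resource.title]
  resource.websites.each do |site|
    lines << "    #{site.title}"
    site.seeds.each do |seed|
      lines << "        #{seed.title}"
      lines << "          Digital Object Instance (#{seed.digital_object.file_uri})"
    end
  end
  lines
end

puts outline(EXAMPLE_RESOURCE)
```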

Usage

To enable this plugin, clone this repository to your ArchivesSpace installation's plugins/ directory and add bhl_aspace_archive_it to AppConfig[:plugins] in your ArchivesSpace config.rb.
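
As a minimal sketch, the relevant line in config.rb might look like the fragment below. The 'local' entry is only an example of a plugin that is already enabled; keep whatever your installation already lists and append this plugin's directory name.

```ruby
# config/config.rb (fragment, not a complete file).
# 'local' stands in for plugins you already have enabled.
AppConfig[:plugins] = ['local', 'bhl_aspace_archive_it']
```

ArchivesSpace generally needs to be restarted before a newly added plugin is loaded.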

To import a new seed URL into ArchivesSpace, open the repository settings menu (gear icon) in ArchivesSpace and select "Archive-It Import" from the "Plug-ins" drop-down menu. You will be taken to the following screen:

[Screenshot: Archive-It Import Screen]

Copy and paste a seed URL from the Archive-It administrative interface (note: this is the URL of the Archive-It seed, not the URL of the website) and click the "Import" button. You will be redirected to a new archival object associated with the resource that the collection map in AppConfig[:archive_it] assigns to the seed's Archive-It collection. From there, you may drag and drop the archival object to associate the seed with an existing site-level description, or create a new site-level description with which to associate the seed.
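
To make the distinction between a seed URL and a website URL concrete, here is a small sketch that checks whether a pasted value looks like an Archive-It seed URL. The URL shape is an assumption inferred from a single example (the one in the issue log further down this page); the helper name parse_seed_url and the key account_id are hypothetical and are not part of the plugin.

```ruby
require 'uri'

# Assumed shape of an Archive-It seed URL, inferred from one example:
#   https://partner.archive-it.org/<account>/collections/<collection>/seeds/<seed>
SEED_PATH = %r{\A/(?<account_id>\d+)/collections/(?<collection_id>\d+)/seeds/(?<seed_id>\d+)/?\z}

# Hypothetical helper: returns a hash of integer ids, or nil when the value
# does not look like an Archive-It seed URL (for example, a website URL).
def parse_seed_url(seed_url)
  uri = URI.parse(seed_url.to_s.strip)
  return nil unless uri.host == 'partner.archive-it.org'

  match = SEED_PATH.match(uri.path)
  return nil unless match

  {
    account_id:    match[:account_id].to_i,
    collection_id: match[:collection_id].to_i,
    seed_id:       match[:seed_id].to_i
  }
rescue URI::InvalidURIError
  nil
end

p parse_seed_url('https://partner.archive-it.org/755/collections/13488/seeds/2145033')
p parse_seed_url('https://www.example.org/') # a website URL, so nil
```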

How it Works

This plugin takes advantage of many of the documented ArchivesSpace plugin mechanics. Most of the functionality of this plugin is implemented in the frontend directory. The Archive-It import screen is implemented in frontend/views/archive_it/index.html.erb. The import functionality of this page is defined in an ArchiveItController class (frontend/controllers/archive_it_controller.rb) and an ArchiveItImporter class (frontend/models/archive_it.rb). The ArchiveItController takes the seed URL submitted on the Archive-It import screen and hands it off to the ArchiveItImporter, which uses the Archive-It Partner Metadata API to get basic metadata about the seed (site URL, Archive-It collection ID) and then creates an ArchivesSpace archival object as described above. The ArchiveItController then redirects the user to the new archival object.
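
The record-building step described above might look roughly like the following. This is a hypothetical sketch, not the plugin's actual code: the function names, the shape of the seed hash, the collection map format, the 'file' level, and the Wayback calendar URL pattern are all assumptions made for illustration.

```ruby
# Assumed pattern for an Archive-It Wayback calendar page.
def wayback_calendar_url(collection_id, site_url)
  "https://wayback.archive-it.org/#{collection_id}/*/#{site_url}"
end

# Build the two record payloads for one seed.
# seed:           { url: String, collection_id: Integer } (assumed shape)
# collection_map: { collection_id => resource URI }       (assumed shape)
def build_records(seed, collection_map)
  resource_uri = collection_map.fetch(seed[:collection_id]) do
    raise ArgumentError, "no resource mapped for collection #{seed[:collection_id]}"
  end

  calendar_url = wayback_calendar_url(seed[:collection_id], seed[:url])

  digital_object = {
    'jsonmodel_type'    => 'digital_object',
    'digital_object_id' => calendar_url,
    'title'             => seed[:url],
    'file_versions'     => [{ 'file_uri' => calendar_url }]
  }

  # The instance linking the two records is omitted here: it needs the URI
  # that ArchivesSpace assigns once the digital object has been created.
  archival_object = {
    'jsonmodel_type' => 'archival_object',
    'title'          => seed[:url],
    'level'          => 'file',
    'resource'       => { 'ref' => resource_uri }
  }

  [archival_object, digital_object]
end

# Example with made-up values.
ao, dobj = build_records(
  { url: 'https://www.example.org/', collection_id: 13488 },
  { 13488 => '/repositories/2/resources/1' }
)
puts ao['resource']['ref']
puts dobj['file_versions'].first['file_uri']
```

Keeping record construction separate from the HTTP calls makes it possible to test the mapping logic without a running ArchivesSpace instance or Archive-It credentials.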

The plugin also implements several new ArchivesSpace API endpoints in backend/controllers/archive_it.rb. These endpoints use ArchiveIt and ArchiveItExport classes defined in backend/model/archive_it.rb and backend/model/lib/archive_it_export.rb, respectively. These endpoints include:

/repositories/:repo_id/archive_it/marc_candidates

This endpoint takes a repository id (:repo_id) and calls the get_marc_candidates function from the ArchiveIt class, which returns a list of all site-level archival objects.

/repositories/:repo_id/archive_it/archive_it_collections

This endpoint takes a repository id (:repo_id) and calls the get_archive_it_collection_map function from the ArchiveIt class, which returns the collection map configured in AppConfig[:archive_it].

/repositories/:repo_id/archive_it/archive_it_marc/:id.xml

This endpoint takes a repository id (:repo_id) and archival object id (:id) and calls the generate_archive_it_marc function from the ArchiveItExport class, which returns a MARC XML representation of the archival object. The MARC XML mappings are configured in backend/model/lib/archive_it_marc_model.rb and backend/model/lib/archive_it_marc_serializer.rb.
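
A client for these endpoints could be sketched as follows. The three paths come from the list above; everything else is an assumption: the helper names are hypothetical, the endpoints are assumed to respond to GET, and the caller is assumed to already hold a valid ArchivesSpace session token.

```ruby
require 'net/http'
require 'uri'

# Path builders for the three endpoints listed above.
def marc_candidates_path(repo_id)
  "/repositories/#{Integer(repo_id)}/archive_it/marc_candidates"
end

def archive_it_collections_path(repo_id)
  "/repositories/#{Integer(repo_id)}/archive_it/archive_it_collections"
end

def archive_it_marc_path(repo_id, id)
  "/repositories/#{Integer(repo_id)}/archive_it/archive_it_marc/#{Integer(id)}.xml"
end

# Hypothetical helper: GET a path from the ArchivesSpace backend, passing the
# session token in the X-ArchivesSpace-Session header.
def backend_get(backend_url, path, session)
  uri = URI.join(backend_url, path)
  request = Net::HTTP::Get.new(uri)
  request['X-ArchivesSpace-Session'] = session
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end

puts archive_it_marc_path(2, 42)
```

With a backend at a made-up address such as http://localhost:8089, a call like backend_get('http://localhost:8089', archive_it_marc_path(2, 42), session) would return a Net::HTTPResponse whose body is expected to hold the MARC XML.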

bhl_aspace_archive_it's People

Contributors

eckardm, emcolonm, gii2000

Forkers

gii2000

bhl_aspace_archive_it's Issues

Archive-It Seed Import Failure

We installed the plugin according to the instructions, but when we tried to import a seed we received a 500 error:

I, [2020-03-02T10:38:59.050923 #27911]  INFO -- : Processing by ArchiveItController#import as HTML
I, [2020-03-02T10:38:59.051085 #27911]  INFO -- :   Parameters: {"utf8"=>"✓", "authenticity_token"=>"[REDACTED]", "seed_url"=>"https://partner.archive-it.org/755/collections/13488/seeds/2145033"}
I, [2020-03-02T10:39:59.097808 #27911]  INFO -- : Completed 500 Internal Server Error in 60046ms
F, [2020-03-02T10:39:59.102558 #27911] FATAL -- :   
F, [2020-03-02T10:39:59.102727 #27911] FATAL -- : Net::OpenTimeout (Failed to open TCP connection to partner.archive-it.org:443 (execution expired)):
F, [2020-03-02T10:39:59.102874 #27911] FATAL -- :   
F, [2020-03-02T10:39:59.103381 #27911] FATAL -- : org/jruby/ext/socket/RubyTCPSocket.java:112:in `initialize'
org/jruby/RubyIO.java:1137:in `open'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:885:in `block in connect'
org/jruby/ext/timeout/Timeout.java:149:in `timeout'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:883:in `connect'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:868:in `do_start'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:857:in `start'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:585:in `start'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:480:in `get_response'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/net/http.rb:457:in `get'
/data/archivesspace/plugins/bhl_aspace_archive_it/frontend/models/archive_it.rb:17:in `get_seed_metadata'
/data/archivesspace/plugins/bhl_aspace_archive_it/frontend/models/archive_it.rb:35:in `create_archival_object'
/data/archivesspace/plugins/bhl_aspace_archive_it/frontend/controllers/archive_it_controller.rb:21:in `import'
actionpack (5.0.1) lib/action_controller/metal/basic_implicit_render.rb:4:in `send_action'
actionpack (5.0.1) lib/abstract_controller/base.rb:188:in `process_action'
actionpack (5.0.1) lib/action_controller/metal/rendering.rb:30:in `process_action'
actionpack (5.0.1) lib/abstract_controller/callbacks.rb:20:in `block in process_action'
activesupport (5.0.1) lib/active_support/callbacks.rb:126:in `call'
activesupport (5.0.1) lib/active_support/callbacks.rb:506:in `block in compile'
activesupport (5.0.1) lib/active_support/callbacks.rb:455:in `call'
activesupport (5.0.1) lib/active_support/callbacks.rb:101:in `__run_callbacks__'
activesupport (5.0.1) lib/active_support/callbacks.rb:750:in `_run_process_action_callbacks'
activesupport (5.0.1) lib/active_support/callbacks.rb:90:in `run_callbacks'
actionpack (5.0.1) lib/abstract_controller/callbacks.rb:19:in `process_action'
actionpack (5.0.1) lib/action_controller/metal/rescue.rb:20:in `process_action'
actionpack (5.0.1) lib/action_controller/metal/instrumentation.rb:36:in `block in process_action'
activesupport (5.0.1) lib/active_support/notifications.rb:164:in `block in instrument'
activesupport (5.0.1) lib/active_support/notifications/instrumenter.rb:21:in `instrument'
activesupport (5.0.1) lib/active_support/notifications.rb:164:in `instrument'
actionpack (5.0.1) lib/action_controller/metal/instrumentation.rb:30:in `process_action'
actionpack (5.0.1) lib/action_controller/metal/params_wrapper.rb:248:in `process_action'
actionpack (5.0.1) lib/abstract_controller/base.rb:126:in `process'
actionview (5.0.1) lib/action_view/rendering.rb:30:in `process'
actionpack (5.0.1) lib/action_controller/metal.rb:190:in `dispatch'
actionpack (5.0.1) lib/action_controller/metal.rb:262:in `dispatch'
actionpack (5.0.1) lib/action_dispatch/routing/route_set.rb:50:in `dispatch'
actionpack (5.0.1) lib/action_dispatch/routing/route_set.rb:32:in `serve'
actionpack (5.0.1) lib/action_dispatch/journey/router.rb:39:in `block in serve'
org/jruby/RubyArray.java:1734:in `each'
actionpack (5.0.1) lib/action_dispatch/journey/router.rb:26:in `serve'
actionpack (5.0.1) lib/action_dispatch/routing/route_set.rb:725:in `call'
rack (2.0.5) lib/rack/etag.rb:25:in `call'
rack (2.0.5) lib/rack/conditional_get.rb:38:in `call'
rack (2.0.5) lib/rack/head.rb:12:in `call'
rack (2.0.5) lib/rack/session/abstract/id.rb:232:in `context'
rack (2.0.5) lib/rack/session/abstract/id.rb:226:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/cookies.rb:613:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/callbacks.rb:38:in `block in call'
activesupport (5.0.1) lib/active_support/callbacks.rb:97:in `__run_callbacks__'
activesupport (5.0.1) lib/active_support/callbacks.rb:750:in `_run_call_callbacks'
activesupport (5.0.1) lib/active_support/callbacks.rb:90:in `run_callbacks'
actionpack (5.0.1) lib/action_dispatch/middleware/callbacks.rb:36:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/remote_ip.rb:79:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/debug_exceptions.rb:49:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/show_exceptions.rb:31:in `call'
railties (5.0.1) lib/rails/rack/logger.rb:36:in `call_app'
railties (5.0.1) lib/rails/rack/logger.rb:24:in `block in call'
activesupport (5.0.1) lib/active_support/tagged_logging.rb:69:in `block in tagged'
activesupport (5.0.1) lib/active_support/tagged_logging.rb:26:in `tagged'
activesupport (5.0.1) lib/active_support/tagged_logging.rb:69:in `tagged'
railties (5.0.1) lib/rails/rack/logger.rb:24:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/request_id.rb:24:in `call'
rack (2.0.5) lib/rack/method_override.rb:22:in `call'
rack (2.0.5) lib/rack/runtime.rb:22:in `call'
activesupport (5.0.1) lib/active_support/cache/strategy/local_cache_middleware.rb:28:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/executor.rb:12:in `call'
actionpack (5.0.1) lib/action_dispatch/middleware/static.rb:136:in `call'
rack (2.0.5) lib/rack/sendfile.rb:111:in `call'
railties (5.0.1) lib/rails/engine.rb:522:in `call'
uri:classloader:/rack/handler/servlet.rb:22:in `call'

Visiting the seed URL manually in a browser redirects us to a login page, which makes sense. This made me realize that the instructions don't say where the Archive-It login information should go. Does it belong in the ArchivesSpace config.rb file? I don't see any configuration page in the web UI.

Check for duplicates

The plugin should check ArchivesSpace to determine whether a given seed has already been imported, in order to prevent duplicates.
