wax_tasks's Introduction

wax_tasks 🐝

wax_tasks is a gem-packaged set of Rake tasks for creating minimal exhibition sites with Wax.

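It can be used to generate collection pages from a metadata file, build a client-side search index, and create simple or IIIF image derivatives from local image and PDF files.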

Getting Started

Prerequisites

You'll need Ruby >= 3.2 with bundler installed. Check your versions with:

$ ruby -v
  ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]

$ bundler -v
  Bundler version 2.4.16

To use the image derivative tasks, you will also need to have ImageMagick and Ghostscript installed and functional. You can check to see if you have ImageMagick by running:

$ convert -version
  Version: ImageMagick 7.1.1-12 Q16-HDRI aarch64 21239 https://imagemagick.org
  Copyright: (C) 1999 ImageMagick Studio LLC
  License: https://imagemagick.org/script/license.php
  Features: Cipher DPC HDRI Modules OpenMP(5.0)
  Delegates (built-in): bzlib fontconfig freetype gslib heic jng jp2 jpeg jxl lcms lqr ltdl lzma openexr png ps raw tiff webp xml zlib
  Compiler: gcc (4.2)

... and check Ghostscript with:

$ gs -version
  GPL Ghostscript 10.01.2 (2023-06-21)
  Copyright (C) 2023 Artifex Software, Inc.  All rights reserved.

Next, you'll need a Jekyll site. You can clone the minicomp/wax demo site or start a site from scratch with:

$ gem install jekyll
$ jekyll new wax && cd wax

Installation

Add wax_tasks to your Jekyll site's Gemfile:

gem 'wax_tasks'

... and install with bundler:

$ bundle install

Create a Rakefile with the following:

spec = Gem::Specification.find_by_name 'wax_tasks'
Dir.glob("#{spec.gem_dir}/lib/tasks/*.rake").each { |r| load r }

Usage

After following the installation instructions above, you will have access to the Rake tasks in your shell by running $ bundle exec rake wax:taskname in the root directory of your Jekyll site. To see the available tasks, run:

$ bundle exec rake --tasks

Sample site _config.yml file:

# basic settings
title: Wax.
description: a jekyll theme for minimal exhibitions
url: 'https://minicomp.github.io'
baseurl: '/wax'

# build settings
permalink: pretty # optional, creates `/page/` link instead of `page.html` links

# wax collection settings
collections:
  objects: # the collection name
    layout: 'iiif-image-page'
    output: true # this must be true for your .md pages to be built to html!
    metadata:
      source: 'objects.csv' # path to the metadata file, must be within '_data'
    images:
      source: 'source_images/objects' # path to the directory of source images, must be within '_data'

# wax search index settings
search:
  main:
    index: 'js/lunr-index.json' # where the index will be generated
    collections: # the collections to index
      objects:
        content: false # whether or not to index the markdown page content (below the YAML)
        fields: # the metadata fields to index
          - 'label'
          - 'artist'
          - 'location'
          - 'object_type'

The above example includes a single collection objects that comprises:

  1. a CSV metadata:source file (objects.csv), and
  2. an images:source directory of image and PDF files.

For more information on configuring Jekyll collections for wax_tasks, check out the minicomp/wax wiki and https://jekyllrb.com/docs/collections/.
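
As a rough illustration of how the collection settings above fit together, the following Ruby sketch parses a trimmed copy of the sample config and resolves the two source paths for the objects collection. The `source_paths` helper is hypothetical and not part of wax_tasks; it only assumes, as the config comments state, that both sources live inside `_data`.

```ruby
require 'yaml'

# Trimmed copy of the sample _config.yml shown above.
config_yaml = <<~CONFIG
  collections:
    objects:
      layout: 'iiif-image-page'
      output: true
      metadata:
        source: 'objects.csv'
      images:
        source: 'source_images/objects'
CONFIG

# Hypothetical helper (not part of wax_tasks): resolve a collection's
# metadata and image sources relative to the site's _data directory.
def source_paths(config, name)
  collection = config.fetch('collections').fetch(name)
  {
    metadata: File.join('_data', collection.dig('metadata', 'source')),
    images: File.join('_data', collection.dig('images', 'source'))
  }
end

config = YAML.safe_load(config_yaml)
paths = source_paths(config, 'objects')
puts paths[:metadata] # _data/objects.csv
puts paths[:images]   # _data/source_images/objects
```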

Running the tasks

wax:pages

Takes a CSV, JSON, or YAML file of collection metadata and generates a markdown page for each record to a directory using a specified layout. Read More.

$ bundle exec rake wax:pages collection-name

wax:search

Generates a client-side JSON search index of your site for use with ElasticLunr.js. Read More.

$ bundle exec rake wax:search search-name

wax:derivatives:simple

Takes a local directory of images and pdf files and generates a few image derivatives (i.e., 'thumbnail' 250w and 'full' 1140w) for Jekyll layouts and includes to use. Read More.

$ bundle exec rake wax:derivatives:simple collection-name

wax:derivatives:iiif

Takes a local directory of images and PDF files and generates tiles and data that work with an IIIF-compliant image viewer like OpenSeadragon, Mirador, or Leaflet IIIF. Read More.

$ bundle exec rake wax:derivatives:iiif collection-name

wax:clobber

Destroys (or "clobbers") wax-generated files, i.e., pages generated from wax:pages, derivatives generated from wax:derivatives, and search indexes generated with wax:search, so you can start from scratch.

This task does not touch your source metadata or source image files! Instead, it simply clears a path for you to regenerate your collection materials in case you add/edit source materials.

$ bundle exec rake wax:clobber collection-name

Contributing

Fork/clone the repository. After making code changes, run the tests ($ bundle exec rubocop and $ bundle exec rspec) before submitting a pull request. You can enable verbose tests with $ DEBUG=true bundle exec rspec.

License

The gem is available as open source under the terms of the MIT License.

wax_tasks's People

Contributors

cassws, dependabot-support, depfu[bot], mnyrop, pbinkley

wax_tasks's Issues

Simple derivative generation is slow; still makes images in memory even if file path exists.

This would require a decent amount of work to refactor but would be worth it.

Current behavior:

Skips writing derivative (in memory) to disk if target file already exists

Desired behavior:

Should skip creating the derivative at all if the target file already exists.
This means collections need to know all of their target derivative paths in advance.
If successful, derivative tasks would take the same amount of time on their first run but would be much quicker when rerun.
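
The desired behavior could be sketched roughly as follows. This is not the wax_tasks implementation: `write_derivative` is a hypothetical helper, and the lambda stands in for the expensive image resize. The point is that the block producing the image data is only called when the target path is missing.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper (not the wax_tasks implementation): yield only when
# the target file is missing, so the expensive work is skipped entirely
# on reruns instead of being done in memory and then discarded.
def write_derivative(target)
  return :skipped if File.exist?(target)

  FileUtils.mkdir_p(File.dirname(target))
  File.binwrite(target, yield)
  :created
end

dir = Dir.mktmpdir
target = File.join(dir, 'thumbnail', 'obj1.jpg')
calls = 0
render = -> { calls += 1; 'fake image bytes' } # stand-in for the resize step

first = write_derivative(target) { render.call }
second = write_derivative(target) { render.call }
puts first, second, calls

FileUtils.remove_entry(dir)
```

For this to work across a whole collection, the target paths must be computable from the source file names alone, which is the "know the paths in advance" requirement noted above.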

Integrate wax tasks w/jekyll build or something.

Is your feature request related to a problem? Please describe.

This is probably quite a difficult thing to do. But it is quite hard to continuously update a wax site right now. If a CSV or exhibit is edited, it's hard to know if those changes are getting pushed through to the final version, so it often involves:

  1. Shutting down the jekyll server
  2. Running wax:clobber (maybe--because it's not clear from the user end whether, e.g., minor changes to images will be passed through into the manifests or not) which requires a rebuild of all the derivative files.
  3. Running wax:pages
  4. Running wax:search (although I skip this in iterative development)
  5. Restarting Jekyll with a clean rebuild.

This takes 30-45 minutes on our collection.

Describe the solution you'd like

In an ideal world, I want to be able to run bundle exec wax serve, which wraps jekyll serve with continuous updating of changed files.

Describe alternatives you've considered

  1. Clear guarantees in the documentation about which rebuild steps are actually necessary when which files are edited. (Like--under what circumstances is it safe to keep the derivatives from a previous build?)
  2. Making my own Makefile to handle the rebuilding, since I'm not a ruby dev and don't really know how to plumb the interior of the rakefiles
  3. Fine-tuning the clobber command so that you can delete everything except the IIIF manifests. This alone would be a big help, especially in combination with 1.
  4. Porting the Wax themes to Svelte-kit instead of Jekyll to get instant hot-module-reloading based on JSON description in dev, with pretty-much identical static HTML produced only at the time of a publication build and optionally rebuilt client-side in-browser. (I have indeed seriously considered this--not saying that you should, but you asked!)

Additional context
This flurry of issues is being caused by my having borked up my OpenSeadragon displays somehow, and it being extremely slow to trace the flow of changes in the upstream metadata to the user-facing HTML. Nothing in there is Wax's fault, I'm pretty confident, but it would still be nice for the platform to be debuggable.

Fail on bad metadata

Wax has constraints on metadata that users might not immediately notice--e.g., that ids cannot include a period or uppercase letters, and that certain fields have special meanings.

If these rules are violated, a site will build but break in unpredictable ways. Would you consider instead throwing an error that explains what needs to happen differently? I suspect this will save users time by helping them identify where things have gone wrong.
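
A check along these lines might look like the sketch below. It is a hypothetical illustration, not existing wax_tasks behavior: it assumes the id field is named `pid` and covers only the two rules named above (no periods, no uppercase letters).

```ruby
# Hypothetical check (not part of wax_tasks): collect a readable message for
# every record whose pid breaks the rules mentioned above.
def pid_errors(records)
  records.each_with_index.filter_map do |record, index|
    pid = record['pid'].to_s
    problems = []
    problems << 'contains a period' if pid.include?('.')
    problems << 'contains uppercase letters' if pid.match?(/[A-Z]/)
    "row #{index + 1}: pid #{pid.inspect} #{problems.join(' and ')}" unless problems.empty?
  end
end

records = [{ 'pid' => 'obj1' }, { 'pid' => 'Obj.2' }]
errors = pid_errors(records)
puts errors
```

A task could raise with `errors.join("\n")` when the list is not empty, so the build stops with an explanation instead of producing a broken site.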

Add error message for cases where number of images and number of rows don't match

Is your feature request related to a problem? Please describe.
This happened twice to me already in completely different contexts, so I'm guessing it will be a relatively common problem. Terminal spits out a nil value error. We might be able to be more specific about the nature of the error so producers can quickly address.
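
One possible shape for a more specific message is sketched below. It is hypothetical and not part of wax_tasks, and it assumes that each source image (or PDF) is matched to a metadata row by a base file name equal to the row's pid.

```ruby
# Hypothetical check (not part of wax_tasks): compare metadata pids with the
# base names of the source files and report what is missing on either side.
def compare_sources(pids, image_names)
  {
    records_without_images: pids - image_names,
    images_without_records: image_names - pids
  }
end

pids = %w[obj1 obj2 obj3]
image_names = %w[obj1.jpg obj3.tif obj4.pdf].map { |file| File.basename(file, '.*') }
mismatch = compare_sources(pids, image_names)

mismatch.each do |kind, names|
  puts "#{kind}: #{names.join(', ')}" unless names.empty?
end
```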

Remove layout population from wax:pages

Is your feature request related to a problem? Please describe.
Putting the layout in the front matter of a wax page is unnecessary, since the layout must already be set in the _config.yml file and Jekyll will auto-populate it from there. If you change it in the _config.yml file, you have to regenerate all of your page files, which is unnecessary extra work for the user.

Describe the solution you'd like
Remove the piece of code that puts the layout in the front matter of every wax page.

Describe alternatives you've considered
There could be a setting that will put it in the front matter, but this should not be the default.

Additional context
Line 15
https://github.com/minicomp/wax/blame/e4dcaa10a28b427cd502ac960b8943952805ed7d/_qatar/obj10.md

Create .wax_cache with JSON state information

  • Track item types e.g.,
    • existence/list/md5 of source image asset(s)
    • existence/list/md5 of IIIF derivatives
    • existence/list/md5 of image variants
  • Keep wax-generated fields namespaced & separate from source metadata file
  • Do not .gitignore by default??
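
A minimal sketch of the idea, assuming a flat JSON file that maps each asset path to its MD5 digest. The `.wax_cache` layout, the `source_images` key, and both helper names are hypothetical; nothing here reflects an existing wax_tasks format.

```ruby
require 'digest/md5'
require 'fileutils'
require 'json'
require 'tmpdir'

# Hypothetical state snapshot: map each asset path to its MD5 digest.
def asset_state(paths)
  paths.sort.to_h { |path| [path, Digest::MD5.file(path).hexdigest] }
end

# Paths whose digest differs from (or is absent in) the previously saved state.
def changed_assets(previous, current)
  current.reject { |path, md5| previous[path] == md5 }.keys
end

dir = Dir.mktmpdir
image = File.join(dir, 'obj1.jpg')
File.write(image, 'v1')

cache_file = File.join(dir, '.wax_cache')
File.write(cache_file, JSON.pretty_generate('source_images' => asset_state([image])))

File.write(image, 'v2') # simulate an edited source image
previous = JSON.parse(File.read(cache_file)).fetch('source_images')
changed = changed_assets(previous, asset_state([image]))
unchanged = changed_assets(asset_state([image]), asset_state([image]))
puts changed.length

FileUtils.remove_entry(dir)
```

Keeping this state in its own file, rather than in the metadata CSV, is what keeps wax-generated fields separate from the source metadata.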

Create Rake Task for creating PDF derivatives

Is your feature request related to a problem? Please describe.
I am trying to hook WAX up with GitHub actions so you can create derivatives via GitHub and don't have to download the repo. The problem I am running into is splitting PDFs doesn't work on GitHub actions using an Ubuntu machine and if you use a Mac to create the derivatives it takes twice as long to create the derivatives. What I would like to do is split the images on a Mac and then create the derivatives on an Ubuntu machine. I thought I would get away with running a ruby script that imported the wax_tasks library and run the script but it looks like all the methods are made private or something because they are not showing up as a method I can run using the library.

Describe the solution you'd like
A rake task to create the image derivatives.

Describe alternatives you've considered
Making the methods public to use when importing the library to ruby.

Additional context
n/a

Derivatives:simple should support multi-page items

  • use OpenSeadragon for simple derivatives (see: minicomp/wax#56)
  • derivatives:simple should create an array of images to add to collection page front matter IF the item is paged
  • derivatives:simple should use the first image for main thumbnail and full banner 🐛

Edit: These changes in wax_tasks would be best written into a v2.0 that makes use of tmp and state-tracking JSON files so that the array of paged derivatives & their variants can be captured without writing that info back into a canonical CSV.

Time to rip the band-aid off! 🩹⏲️

Search Generates InvalidConfig error

Describe the bug

When running the rake wax:search qatar task, rake exits with an error:

rake aborted!
WaxTasks::Error::InvalidConfig:
...

To Reproduce
Steps to reproduce the behavior:

  1. Clone the repository
  2. Run bundle install
  3. Run bundle exec rake wax:search qatar
  4. See error

Expected behavior
The /search/index.json file to be generated.

Error Trace Output

$ bundle exec rake wax:search qatar --trace
rake aborted!
WaxTasks::Error::InvalidConfig:
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/wax_tasks-1.0.2/lib/wax_tasks/config.rb:54:in `search'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/wax_tasks-1.0.2/lib/wax_tasks/site.rb:40:in `generate_static_search'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/wax_tasks-1.0.2/lib/tasks/search.rake:12:in `block (3 levels) in <top (required)>'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/wax_tasks-1.0.2/lib/tasks/search.rake:12:in `each'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/wax_tasks-1.0.2/lib/tasks/search.rake:12:in `block (2 levels) in <top (required)>'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/task.rb:273:in `block in execute'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/task.rb:273:in `each'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/task.rb:273:in `execute'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/task.rb:214:in `block in invoke_with_call_chain'
/Users/<user>/.rvm/rubies/ruby-2.6.3/lib/ruby/2.6.0/monitor.rb:230:in `mon_synchronize'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/task.rb:194:in `invoke_with_call_chain'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/task.rb:183:in `invoke'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:160:in `invoke_task'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:116:in `block (2 levels) in top_level'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:116:in `each'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:116:in `block in top_level'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:125:in `run_with_threads'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:110:in `top_level'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:83:in `block in run'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:186:in `standard_exception_handling'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/lib/rake/application.rb:80:in `run'
/Users/<user>/.rvm/gems/ruby-2.6.3/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
/Users/<user>/.rvm/gems/ruby-2.6.3/bin/rake:23:in `load'
/Users/<user>/.rvm/gems/ruby-2.6.3/bin/rake:23:in `<main>'
/Users/<user>/.rvm/gems/ruby-2.6.3/bin/ruby_executable_hooks:24:in `eval'
/Users/<user>/.rvm/gems/ruby-2.6.3/bin/ruby_executable_hooks:24:in `<main>'
Tasks: TOP => wax:search

Additional leading slashes between hostname and IIIF resources

Describe the bug
There are additional slashes between the hostname and the path for IIIF resources in manifests, etc.

To Reproduce
Steps to reproduce the behavior:

  1. Check out the repo (a minimal wax exhibition): https://github.com/anarchivist/chandlery.git
  2. Build using the following:
bundle exec rake wax:derivatives:iiif
bundle exec rake wax:pagemaster
bundle exec jekyll build
  3. Inspect the output

Expected behavior
Something like https://example.com/img/derivatives/iiif/MKP_392_0032/manifest.json should appear instead of https://example.com//img/derivatives/iiif/MKP_392_0032/manifest.json
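
A doubled slash like this typically appears when a base URL ending in `/` is concatenated with a path starting with `/`. The sketch below shows one defensive way to join the pieces; `join_url` is a hypothetical helper, not the code wax_tasks uses to build manifest URLs.

```ruby
# Hypothetical helper (not the wax_tasks implementation): join URL segments
# with exactly one slash between them, whatever slashes the inputs carry.
def join_url(base, *segments)
  parts = segments.map { |segment| segment.to_s.gsub(%r{\A/+|/+\z}, '') }.reject(&:empty?)
  [base.to_s.sub(%r{/+\z}, ''), *parts].join('/')
end

url = join_url('https://example.com/', '/img/derivatives/iiif/', 'MKP_392_0032/manifest.json')
puts url # https://example.com/img/derivatives/iiif/MKP_392_0032/manifest.json
```

Empty segments are dropped, so a blank baseurl between the hostname and the path does not introduce an extra slash either.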

Change console log while generating derivatives

Describe the bug
While generating, the message should indicate that the process is happening, rather than wait for it to finish. This is important for big images.

Create derivative CSVs for collections instead of editing user-generated ones.

Is your feature request related to a problem? Please describe.

When debugging an issue in a site, it's important to be able to start over from a clean slate. 'wax:clobber' generally allows this, but as @mnyrop notes in the issue creating it, #46, it doesn't eliminate columns that have been added to _data/{collection}.csv. So if you want to actually do a clean install, you need to maintain a backup copy of your csv and start copying it into _data.

Describe the solution you'd like
That wax overwrites a user-created file with an altered version makes me squeamish. I don't know Jekyll, but every static system I've used maintains a bright line between user generated files and program-generated ones. Like, you would never script something that adds new lines to _config.yml as part of a build process? I can't see a clearcut case where this would be disastrous, but it's nice, for instance, to have the modification date on the data csv be the date that the user actually edited it, not the last time they ran a wax build that updated it.

So just have a file at like _{collection}/{collection}.csv that wax creates which is the user csv from _data/{collection}.csv but with the columns order,layout,collection,thumbnail,full,manifest, and whatever else you're creating added.
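
The proposal could be sketched as follows. Everything here is hypothetical: the `write_derived_csv` helper, the `pid` key, and the example thumbnail path are assumptions for illustration, and wax_tasks does not currently work this way. The user's CSV is only read, and the generated columns go into a separate file.

```ruby
require 'csv'
require 'fileutils'
require 'tmpdir'

# Hypothetical sketch: copy the user's CSV into the collection directory with
# generated columns appended, leaving the file in _data untouched.
def write_derived_csv(source, target, generated, columns)
  table = CSV.read(source, headers: true)
  headers = table.headers + columns
  FileUtils.mkdir_p(File.dirname(target))
  CSV.open(target, 'w', write_headers: true, headers: headers) do |csv|
    table.each do |row|
      extra = generated.fetch(row['pid'], {})
      csv << headers.map { |header| row[header] || extra[header] }
    end
  end
end

dir = Dir.mktmpdir
source = File.join(dir, '_data', 'objects.csv')
target = File.join(dir, '_objects', 'objects.csv')
FileUtils.mkdir_p(File.dirname(source))
original = "pid,label\nobj1,Vase\n"
File.write(source, original)

generated = { 'obj1' => { 'thumbnail' => '/img/derivatives/simple/obj1/thumbnail.jpg' } }
write_derived_csv(source, target, generated, ['thumbnail'])

derived = File.read(target)
source_after = File.read(source)
puts derived

FileUtils.remove_entry(dir)
```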

Describe alternatives you've considered

Leave it as is? I don't know what you're thinking of to allow this to get clobbered.
I've occasionally used a solution of having something called catalog_derived.json in the same folder, but using the collection folder seems more robust to me.

Custom variants are not being generated

Custom variant image sizes specified in _config.yml for a given collection are not being generated.

To Reproduce

  • In out-of-the-box wax, add custom variant sizes to a collection in _config.yml, following the example in the sample site
  • render the iiif derivatives with bundle exec rake wax:derivatives:iiif

Expected behavior
An image such as obj12_00 ought to have the following full-region images in img/derivatives/iiif/images/obj12_00/full/: 50, 250, 1400, full. Instead, only the default sizes are found: 250, full.

Desktop (please complete the following information):

  • OS: Ubuntu 19.10
  • Browser: n/a
  • Version: n/a

Additional context
This is an important feature to make Wax IIIF images fully reusable, since Universal Viewer always requests 90, thumbnails. An item without 90, thumbnails always looks broken in UV.

overwrite yaml records fails

Add a `wax:clobber` task to remove all files generated by wax_tasks

  • should remove generated pages (e.g., _qatar) and images (e.g., img/derivatives), and search index (e.g., search/index.json)
  • should leave the data (e.g., _data/raw_images and _data/qatar.csv)
  • BUT should remove generated columns/fields from metadata file (e.g., manifest from _data/qatar.csv)

Is lowercase necessary for PIDs?

In the "Requirements" section of the Metadata file description, it notes that the pid:

should follow “snake case”, where letters are lowercase, special characters are removed, and spaces are replaced by underscores

Is the lowercase requirement actually a requirement? Our PIDs and corresponding image files include uppercase characters. We could rename the files... but that turns a seamless production workflow into a slightly more complicated / error-prone one.

Reference:

Utils.remove_diacritics(str).to_s.downcase.tr(' ', '_').gsub(/[^\w-]/, '')
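
To see what the referenced line does to a mixed-case PID, here is a stand-in that applies the same chain of string operations. It omits the `Utils.remove_diacritics` step, which is an internal wax_tasks helper not reproduced here, and the method name `slug` is only for illustration.

```ruby
# Stand-in for the referenced line, minus the Utils.remove_diacritics step
# (an internal wax_tasks helper that is not reproduced here).
def slug(str)
  str.to_s.downcase.tr(' ', '_').gsub(/[^\w-]/, '')
end

puts slug('MKP 392.0032') # mkp_3920032
puts slug('Obj-12 Recto') # obj-12_recto
```

Wherever this transformation is applied, an uppercase PID is lowercased and loses its periods, so it no longer matches an image file that still carries the original name.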

page order is incorrect
