harvest's Introduction

Harvest - manage large collections of files and dirs

Harvest makes it easy to list files and folders by type and copy or move them around.

Harvest is a compact and portable script to scan files and folders and recognise their typology. Scanning is based on file extensions and a simple fuzzy logic analysis of folder contents (not just files) to recognise if they are related to video, audio or text materials, etc.

Harvest is fast: it can read approximately 1GB of stored filenames per second and is operated from the console terminal. It never modifies the filesystem: that is done explicitly by the user piping shell commands.

Software by Dyne.org

Harvest operates on folders containing files without exploding the files around: it assesses the typology of a folder from the files it contains, but does not move the files outside of that folder. For instance, it works very well for moving around large collections of downloaded torrent folders.
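The folder-level assessment described above can be pictured with a minimal sketch. This is not Harvest's actual code: the extension table, the function names `classify_ext` and `classify_dir`, and the plain majority vote are all assumptions made for illustration, and Harvest's own rules may well differ.

```shell
#!/bin/sh
# Hypothetical sketch: guess a category from a file name's extension.
# The extension lists below are illustrative, not Harvest's real tables.
classify_ext() {
    name=${1##*/}                       # strip any leading directories
    case "$name" in
        *.*) ext=${name##*.} ;;         # text after the last dot
        *)   ext= ;;                    # no extension at all
    esac
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        mp4|mkv|avi)      echo video ;;
        mp3|flac|ogg)     echo audio ;;
        txt|md)           echo text ;;
        jpg|jpeg|png|gif) echo image ;;
        *)                echo other ;;
    esac
}

# Hypothetical sketch: label a folder with the most frequent category
# among the files it contains. Nothing is moved or modified.
# File names containing newlines are not handled.
classify_dir() {
    find "$1" -type f | while IFS= read -r f; do
        classify_ext "$f"
    done | sort | uniq -c | sort -rn | awk 'NR == 1 { print $2 }'
}
```

Calling `classify_dir /PATH/` on a folder holding mostly `.mkv` files would print `video`, while the files stay where they are.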

💾 Installation

Harvest is a Zsh script and works on any POSIX platform where Zsh can be installed, including GNU/Linux, Apple/OSX and MS/Windows.

Install the latest harvest with:

curl https://raw.githubusercontent.com/dyne/harvest/main/harvest | sudo tee /usr/local/bin/harvest
sudo chmod +x /usr/local/bin/harvest

Dependencies: zsh

Optional:

  • fuse tmsu for tagged filesystem
  • setfattr for setting file attributes

🎮 Usage

Scan a folder /PATH/ to show and save results

 harvest scan [PATH]

List of supported category types:

 code image video book text font web archiv sheet exec slide audio

Move all scanned text files in /PATH/ to /DEST/

 harvest scan [PATH] | grep ';text;' | xargs -rn1 -I% mv % [DEST]

Tag all file attributes in /PATH/ with harvest.type categories

 harvest attr [PATH]

Tag all files for use with TMSU (See section below about TMSU)

 harvest tmsu [PATH]

TMSU

This implementation supports tagged filesystems using TMSU.

To allow the navigation of files in the style of a Semantic Filesystem, Harvest supports TMSU, a small utility that maintains a database of tags inside a hidden directory .tmsu in each harvested folder.

To initialise a tmsu database bootstrapped with harvest's tags in the currently harvested folder, do:

harvest tmsu

Directories indexed this way can then be "mounted" (using fuse) and navigated:

harvest mount

Inside the $harvest hidden subfolder (pointing to .mnt inside the folder), tags become folders containing symbolic links to the actual tagged files. Any file manager that follows symbolic links can be used to navigate tags. Tags are also set as bookmarks in graphical file managers (GTK3 supported).

In addition to the tags view, there is also a queries folder in which you can run queries by listing or creating new folders:

ls -l "$harvest/queries/text and 2018"

This automatic creation of the query folders makes it possible to use new file queries within the file chooser of a graphical program simply by typing the query in. Unwanted query folders can be safely removed.

Limited tag management is also possible via the virtual filesystem. For example one can remove specific tags from a file by deleting the symbolic link in the tag folder, or delete a tag by performing a recursive delete.
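The tag folders described above can be pictured with plain symbolic links. The sketch below is an illustration only and not how TMSU works internally: TMSU serves this view through FUSE from its database. The function names `build_tag_view` and `untag` and the tab-separated input format are hypothetical.

```shell
#!/bin/sh
# Illustration only: emulate a tag view with plain symlinks.

# build_tag_view VIEW_DIR: read "tag<TAB>absolute-path" lines from stdin
# and create VIEW_DIR/<tag>/<file name> links pointing at the tagged files.
build_tag_view() {
    view=$1
    tab=$(printf '\t')
    while IFS=$tab read -r tag path; do
        [ -n "$tag" ] && [ -n "$path" ] || continue   # skip malformed lines
        mkdir -p "$view/$tag"
        ln -sf "$path" "$view/$tag/${path##*/}"
    done
}

# untag VIEW_DIR TAG NAME: drop one tag from one file by deleting its link.
# The file the link points to is left untouched.
untag() {
    rm -f "$1/$2/$3"
}
```

Deleting a link inside a tag folder only removes that entry from the view, which mirrors why removing a symbolic link in the mounted filesystem untags a file without deleting it.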

To unmount all TMSU semantic filesystems currently mounted, just do:

harvest umount

Further TMSU operations are possible directly from inside the directories that have been indexed using harvest tmsu; for more information see tmsu help. For instance, TMSU can also detect duplicate files using tmsu dupes.

harvest's People

Contributors

g-gundam, jaromil, puria

harvest's Issues

mv and cp are not implemented so I wrote a wrapper

harvest/src/harvest.lua

Lines 168 to 178 in a9466f9

cli:command("mv", "move the selection to destination")
:argument('dest','folder to which the selection will be moved')
:action(function(options)
print("TODO: mv to destination: "..options.dest)
end)
cli:command("cp", "copy the selection to destination")
:argument('dest','folder to which the selection will be moved')
:action(function(options)
print("TODO: mv to destination: "..options.dest)
end)

I like the idea behind harvest, and it was pretty close to what I needed for my ~/Downloads cleanup project, but I saw that the mv and cp commands were not implemented yet. However, it seemed like I could make a wrapper around the current unfinished harvest and implement cp and mv myself. The result was hvst.

Usage:
  hvst
  hvst help
  hvst cp <TYPE> <DEST> [OPTION]...
  hvst mv <TYPE> <DEST> [OPTION]...

Examples:

  # Display help message.
  hvst help

  # List files in current directory.
  hvst

  # Copy all images to ~/Pictures.
  hvst cp image ~/Pictures

  # Copy all images to ~/Pictures and be verbose.
  hvst cp image ~/Pictures --verbose

  # Show what would happen if we tried to move all images to ~/Pictures/YYYY/MM.
  hvst mv image '@"~/Pictures/$_->{year}/$_->{month}"' --recon

  # Move all images to ~/Pictures/YYYY/MM.
  hvst mv image '@"~/Pictures/$_->{year}/$_->{month}"'

  # Both cp and mv commands take --verbose and --recon (or -v and -r for short).

I'm posting here in case someone finds this useful.
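For readers who want the dated-destination and dry-run behaviour from the examples above without the wrapper, a minimal shell sketch follows. It is not hvst's code: the function names `dated_dest` and `plan_or_move` are hypothetical, taking the year and month from the file's modification time is an assumption, and `date -r FILE` assumes GNU date.

```shell
#!/bin/sh
# Hypothetical sketch, not hvst's implementation.
# Assumes GNU date, where `date -r FILE` prints FILE's modification time.

# dated_dest BASE FILE: print BASE/YYYY/MM based on FILE's modification time.
dated_dest() {
    printf '%s/%s\n' "$1" "$(date -r "$2" +%Y/%m)"
}

# plan_or_move MODE BASE FILE...
#   MODE=recon  only print the moves that would happen
#   MODE=run    create the destination folders and move the files
plan_or_move() {
    mode=$1 base=$2
    shift 2
    for f in "$@"; do
        dest=$(dated_dest "$base" "$f")
        if [ "$mode" = recon ]; then
            printf 'mv %s %s/\n' "$f" "$dest"
        else
            mkdir -p "$dest" && mv -- "$f" "$dest/"
        fi
    done
}
```

Running `plan_or_move recon ~/Pictures *.jpg` first and reading the printed plan before switching to `run` keeps the move explicit, in the same spirit as the `--recon` option.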

hanging while scanning a dir with 200_000 images

Hi guys!

I tried to scan a dir with 200k image files with the command:

harvest -p DIR_PATH -t image --file

I get the prompt

Harvest DIR_PATH
type: image

and then it hangs indefinitely.

For comparison, ls DIR_PATH takes 0m1,543s to execute (and print all 200k files).

Am I doing something wrong?

can't compile fatal error: harvest.lua.c: No such file or directory

After the first attempt I needed to install luastatic, and now I'm getting the following error:

$ git submodule update --init --recursive
$ make
CC=cc AR=ar INCLUDE="-I /usr/include/luajit-2.1" LDADD="/usr/lib/x86_64-linux-gnu/libluajit-5.1.a -lm -ldl" CFLAGS="-O3 --fast-math" make -C src
make[1]: Entering directory '/home/sliss/source/harvest/src'
cc -c -I /usr/include/luajit-2.1 -O3 --fast-math lfs.c -o lfs.o
ar rcs lfs.a lfs.o
CC="" luastatic harvest.lua cliargs.lua cliargs/*.lua cliargs/utils/*.lua align.lua lfs.a
cc -c -I /usr/include/luajit-2.1 -O3 --fast-math harvest.lua.c
cc1: fatal error: harvest.lua.c: No such file or directory
compilation terminated.
make[1]: *** [Makefile:9: all] Error 1
make[1]: Leaving directory '/home/sliss/source/harvest/src'
make: *** [Makefile:24: luastatic] Error 2
