diybookscanner / spreads

Modular workflow assistant for book digitization

License: GNU Affero General Public License v3.0

Python 68.75% Makefile 0.10% HTML 0.09% CSS 5.45% JavaScript 25.07% NSIS 0.54%

spreads's Introduction

spreads is a software suite for the digitization of printed material. Its main focus is to integrate existing solutions for individual parts of the scanning workflow into a cohesive package that is intuitive to use and easy to extend.

At its core, it handles the communication with the imaging devices, the post-processing of the captured material and its assembly into output formats like PDF or ePub. On top of this base layer, we have built a variety of interfaces that should fit into most use cases: A full-fledged and mobile-friendly web interface that can be served from even the most low-powered devices (like a Raspberry Pi), a graphical wizard for classical desktop users and a bare-bones command-line interface for purists.

As for extensibility, we offer a plugin API that allows developers to hook into almost every part of the architecture and extend the application according to their needs. There are interfaces for developing a device driver to communicate with new hardware and for writing new postprocessing or output plugins to take advantage of as-yet unsupported third-party software. It is even possible to create a completely new user interface that is better suited to specific environments.

Features

Support for cameras running CHDK as well as cameras supported by libgphoto2 (experimental), with extensive configuration options.
Cropping of the images during capture (only supported in web interface)
Shoot with two devices simultaneously, directly storing the images in a single directory on your computer in the right order.
Automatically rotate images
Run captured images through ScanTailor (attended or unattended)
Recognize text from the images through Tesseract OCR
Generate PDF and DJVU files with hidden text layers
Every project is stored in a directory on your computer and contains all the information that is needed in human-readable form, laid out according to the BagIt specification. This makes it easy to exchange projects between computers.

Interfaces

Web

The interface with the most features. You have the choice between three modes: scanner, processor and full. The first is ideal for slim scanning workstations that just deal with the capturing of the images and little more. From it, you can transfer your scans either to a USB stick or to another instance of spreads running in one of the other two modes (all from your browser!), where they will be post-processed. It is currently the only interface to support cropping during capture and on-the-fly changing of settings during capture.

GUI

A graphical wizard that guides you through every step, from setting up the devices to postprocessing the images.

CLI

A text-only command-line interface that exposes each step as a subcommand. Ideal for controlling a scanner over SSH and for command-line fetishists.

Getting Started

Install the system dependencies:
- sudo apt-get install python2.7-dev python-pip build-essential pkg-config libffi-dev libturbojpeg-dev libmagickwand-dev python-cffi
If you want to use a CHDK device:
- sudo apt-get install liblua5.2-dev libusb-dev
- sudo pip install lupa --install-option="--no-luajit"
- sudo pip install chdkptp.py
If you want to use a libgphoto2-supported device:
- sudo apt-get install libgphoto2-dev
- sudo pip install gphoto2-cffi
Install Python dependencies:
- sudo pip install jpegtran-cffi Flask requests zipstream tornado Wand
- sudo pip install http://buildbot.diybookscanner.org/nightly/spreads-latest.tar.gz
If you want to use the GUI:
- sudo apt-get install python-pyside
If you want to use djvu functionality:
- sudo apt-get install djvubind

Configure spreads and select the plugins you want to use:
- spread configure

Documentation

You can find the detailed manual for users and developers at http://spreads.readthedocs.org

Please note that it is currently woefully incomplete and partially out of date. If you want to help with it, please get in touch!

Getting Help

IRC: irc.freenode.net, #diybookscanner
Forums: http://diybookscanner.org/forums
Bugtracker: https://github.com/DIYBookScanner/spreads/issues

spreads's Issues

Cannot set orientation

Setting the cameras' orientation fails:

Please connect and turn on the devices.
Press any key to continue.
Detecting devices.
PTPDevice: Could not get orientation, reason: Lua error: runtime error: :3: attempt to index global 'file' (a nil value)
PTPDevice: Could not get orientation, reason: Lua error: runtime error: :3: attempt to index global 'file' (a nil value)
Devices not yet configured!
Please turn both devices off. Press any key when ready.
Please connect and turn on the device labeled 'left'
Press any key when ready.
PTPDevice: Could not get orientation, reason: Lua error: runtime error: :3: attempt to index global 'file' (a nil value)
Configured 'left' device.
Please turn off the device.
Press any key when ready.
Please connect and turn on the device labeled 'right'
Press any key when ready.
PTPDevice: Could not get orientation, reason: Lua error: runtime error: :3: attempt to index global 'file' (a nil value)
Could not close session!
pyptpchdk: Could not close session!
Segmentation fault (core dumped)

Metadata support

Add support for Metadata to the spreads workflow.
Users should be able to specify the usual information about digitized content, like author, title, publisher, year, edition, etc.
One approach would be to add a get_metadata function to the workflow, which takes keyword arguments as input and forwards them to plugins, which will return a SpreadsMeta object, that can then be further processed.
This can then be used to enrich the various output formats and to create more meaningful filenames.
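
The proposal above could be sketched roughly as follows. This is only an illustration under assumptions: the `SpreadsMeta` fields are taken from the examples in the issue text, `EchoMetadataPlugin` is a hypothetical plugin, and the merging strategy (later plugins override earlier ones) is one possible choice, not something the issue specifies.

```python
from collections import namedtuple

# Hypothetical container; the field names follow the examples in the issue.
SpreadsMeta = namedtuple(
    'SpreadsMeta', ['author', 'title', 'publisher', 'year', 'edition'])


class EchoMetadataPlugin:
    """Hypothetical plugin that simply returns the values it was given."""

    def get_metadata(self, **kwargs):
        return {key: value for key, value in kwargs.items()
                if value is not None}


def get_metadata(plugins, **kwargs):
    """Forward keyword arguments to each plugin and merge the results."""
    merged = dict.fromkeys(SpreadsMeta._fields)
    for plugin in plugins:
        # Later plugins override values set by earlier ones (an assumption)
        merged.update(plugin.get_metadata(**kwargs))
    # Keys the container does not know about are dropped
    return SpreadsMeta(**{key: merged[key] for key in SpreadsMeta._fields})
```

Output plugins could then read fields such as `meta.title` to build file names or embed document properties.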

DevicePlugin should support other transports for capture

Maybe it already does! Will close this issue if I can write a RaspberryPi over-ethernet capture plugin.

Display log in GUI during postprocessing

In addition to progress bars (issue #5), it would be great to display the current logging output in the QTextEdit on the postprocessing page of the GUI wizard.

autorotate

I have an issue with autorotate:

qabbal# spread wizard --shutter-speed 1/10 cyn
==========================
Starting capturing process
==========================
Found 2 devices!
Setting up devices for capturing.
( /b) capture | (r) retake last shot | (f) finish
Shot   8 pages [1147/h]
=======================
Starting postprocessing
=======================
spreads encountered an error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/EGG-INFO/scripts/spread", line 39, in <module>
    cli.main()
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/spreads/cli.py", line 452, in main
    args.subcommand(config)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/spreads/cli.py", line 290, in wizard
    postprocess(config)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/spreads/cli.py", line 270, in postprocess
    workflow.process()
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/spreads/workflow.py", line 217, in process
    self._run_hook('process', self.path)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/spreads/workflow.py", line 121, in _run_hook
    getattr(plugin, hook_name)(*args)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.0.0-py2.7.egg/spreadsplug/autorotate.py", line 36, in process
    if imgpath.lower()[-4:] not in ('.jpg', 'jpeg'):
AttributeError: 'PosixPath' object has no attribute 'lower'
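
The traceback shows the plugin calling the string method `lower()` on a path object. A minimal sketch of an extension check that works for both strings and path objects is given below; `is_jpeg_path` is a hypothetical helper, not part of spreads, and it is written for Python 3's `pathlib`. It also sidesteps the slice comparison against `'jpeg'` (without the dot) used in the failing line.

```python
from pathlib import PurePosixPath


def is_jpeg_path(imgpath):
    """Return True if the path has a JPEG file extension (case-insensitive)."""
    # suffix is the final extension including the dot, e.g. '.JPG'
    return PurePosixPath(imgpath).suffix.lower() in ('.jpg', '.jpeg')
```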

Error detecting devices

Hi,

I am trying to run spreads on Ubuntu 12.04 to simultaneously trigger two Canon Powershot A810 with CHDK. The spreads wizard fails when trying to detect my cameras, with the error message below. Both cameras are connected and turned on. Not sure what I am doing wrong. Any ideas on how to solve this will be greatly appreciated! Thank you in advance.

Please connect and turn on the devices.
Press any key to continue.
Detecting devices.
spreads encountered an error:
Traceback (most recent call last):
File "/usr/local/bin/spread", line 39, in <module>
spreads.cli.main()
File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 298, in main
args.func(args)
File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 159, in wizard
devices = get_devices()
File "/usr/local/lib/python2.7/dist-packages/spreads/plugin.py", line 325, in get_devices
matches = filter(None, devicemanager.map(match, device))
File "/usr/local/lib/python2.7/dist-packages/stevedore/extension.py", line 137, in map
raise RuntimeError('No %s extensions found' % self.namespace)
RuntimeError: No spreadsplug.devices.dev extensions found

Autorotate not working in latest release?

The autorotate plugin used to work fine for me, but after running

git fetch upstream
git merge upstream/master

earlier today to update the A810 fork with the latest changes, I now get:

$ spread -v postprocess ~/journalconchy
spreads.workflow: Starting postprocessing...
spreads.workflow: Running process hooks
spreadsplug.autorotate: Rotating images in /home/asartori/journalconchy/raw
root: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/EGG-INFO/scripts/spread", line 39, in <module>
    spreads.cli.main()
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/cli.py", line 313, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/cli.py", line 140, in postprocess
    workflow.process(path)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/workflow.py", line 140, in process
    ext.obj.process(path)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreadsplug/autorotate.py", line 72, in process
    if self.config['rotate_inverse'].get(bool):
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/confit.py", line 311, in get
    value, _ = self.first()
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/confit.py", line 168, in first
    raise NotFoundError("{0} not found".format(self.name))
NotFoundError: rotate_inverse not found

There is a problem with your configuration file(s):
rotate_inverse not found

need futures==2.1.4 , 2.1.5 & 2.1.6 cause problems?

I have [email protected]

$ sudo pip uninstall futures
$ sudo pip install futures
Downloading/unpacking futures
  Downloading futures-2.1.6-py2.py3-none-any.whl
Installing collected packages: futures
Successfully installed futures
Cleaning up...
$ rm .config/spreads/config.yaml 
$ spread configure
Please select a device driver from the following list:
  1: chdkcamera
  2: a2200
  3: a1400
Select a driver: 1
Selected "chdkcamera" as device driver
Please select your desired plugins from the following list:
    1: scantailor
    2: web
    3: tesseract
    4: djvubind
    5: gui
    6: autorotate
    7: intervaltrigger
    8: hidtrigger
    9: pdfbeads
Select a plugin (or hit enter to finish): 
stevedore.extension: Could not load 'chdkcamera': futures>=2.1.4
stevedore.extension: futures>=2.1.4
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/stevedore/extension.py", line 134, in _load_plugins
    invoke_kwds,
  File "/usr/local/lib/python2.7/dist-packages/stevedore/named.py", line 95, in _load_one_plugin
    ep, invoke_on_load, invoke_args, invoke_kwds,
  File "/usr/local/lib/python2.7/dist-packages/stevedore/extension.py", line 146, in _load_one_plugin
    plugin = ep.load()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1988, in load
    if require: self.require(env, installer)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2001, in require
    working_set.resolve(self.dist.requires(self.extras),env,installer))
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 584, in resolve
    raise DistributionNotFound(req)
DistributionNotFound: futures>=2.1.4
spreads encountered an error:
Traceback (most recent call last):
  File "/usr/local/bin/spread", line 39, in <module>
    cli.main()
  File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 446, in main
    args.subcommand(config)
  File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 158, in configure
    plugin.setup_plugin_config(config)
  File "/usr/local/lib/python2.7/dist-packages/spreads/plugin.py", line 368, in setup_plugin_config
    driver = get_driver(config["driver"].get(unicode))
  File "/usr/local/lib/python2.7/dist-packages/spreads/plugin.py", line 346, in get_driver
    name=driver_name)
  File "/usr/local/lib/python2.7/dist-packages/stevedore/driver.py", line 31, in __init__
    invoke_kwds=invoke_kwds,
  File "/usr/local/lib/python2.7/dist-packages/stevedore/named.py", line 44, in __init__
    self._init_plugins(extensions)
  File "/usr/local/lib/python2.7/dist-packages/stevedore/driver.py", line 66, in _init_plugins
    (self.namespace, name))
RuntimeError: No u'spreadsplug.devices' driver found, looking for u'chdkcamera'
$ sudo pip uninstall futures
$ sudo pip install futures==2.1.4
Downloading/unpacking futures==2.1.4
  Downloading futures-2.1.4.tar.gz
  Running setup.py (path:/tmp/pip_build_root/futures/setup.py) egg_info for package futures
Installing collected packages: futures
  Running setup.py install for futures
Successfully installed futures
Cleaning up...
$ rm .config/spreads/config.yaml 
$ spread configure
Please select a device driver from the following list:
  1: chdkcamera
  2: a2200
  3: a1400
Select a driver: 1
Selected "chdkcamera" as device driver
Please select your desired plugins from the following list:
    1: scantailor
    2: web
    3: tesseract
    4: djvubind
    5: gui
    6: autorotate
    7: intervaltrigger
    8: hidtrigger
    9: pdfbeads
Select a plugin (or hit enter to finish): 
Do you want to configure the target_page of your devices?
(Required for shooting with two devices) [y/n]: 
Do you want to setup the focus for your cameras? [y/n]: 
Writing configuration file to '/home/matti/.config/spreads/config.yaml'
$

Web interface for capturing process

Add a plugin that provides a (mobile-friendly) web interface for controlling the scanner, making the use of a dedicated terminal for scanning stations obsolete.

Unstable capturing

My cameras are unstable when capturing. I am not sure if this is a hardware/software problem on the camera (correct CHDK version?) or if it has something to do with spreads. I have tested with different USB cables, different Linux versions and another computer.

When I hit the capture button now, my cameras sometimes capture simultaneously, and sometimes it takes 2-5 seconds after the first camera has captured until the second camera captures. I also get a lot of "script timed out" messages.

Download command fails if cameras have RAW images

Hi,

After setting my cameras to save in RAW format (for better color adjustment, etc) spread's download command fails after completing download of the first image from left and right cameras. The log is:

PTPDevice[left]: Camera returned: 1 CRW_0001.CRW
2   IMG_0001.JPG
3   CRW_0002.CRW
4   IMG_0002.JPG

PTPDevice[right]: Camera returned: 1    CRW_0001.CRW
2   IMG_0001.JPG
3   CRW_0002.CRW
4   IMG_0002.JPG

PTPDevice[left]: Downloading "/home/bookscan/test3/left/CRW_0001.CRW"
PTPDevice[right]: Downloading "/home/bookscan/test3/right/CRW_0001.CRW"
PTPdevice[right]: Download complete
PTPdevice[left]: Download complete
spreads encountered an error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/EGG-INFO/scripts/spread", line 39, in <module>
    spreads.cli.main()
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/cli.py", line 313, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/cli.py", line 134, in download
    workflow.download(devices, path)
  File "/usr/local/lib/python2.7/dist-packages/spreads-0.3.3-py2.7.egg/spreads/workflow.py", line 119, in download
    raise exc
InvalidFile: Error reading soi_marker. Got <> should be <��>

Thanks in advance for any help!
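
The `InvalidFile` error mentions the `soi_marker`, i.e. the JPEG start-of-image marker (the two bytes `FF D8`), which suggests that a step after the download tried to read the `.CRW` file as a JPEG. A minimal sketch of a guard that checks the marker before treating a file as JPEG is shown below. `looks_like_jpeg` is a hypothetical helper; whether spreads should skip raw files or handle them separately is not settled by this report.

```python
JPEG_SOI = b'\xff\xd8'  # JPEG start-of-image marker


def looks_like_jpeg(path):
    """Return True if the file begins with the JPEG SOI marker."""
    with open(path, 'rb') as fp:
        return fp.read(2) == JPEG_SOI
```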

Improve logging

Currently, logging is handled globally via calls to the root logger of logging.
It would be better if every component of spreads and every plugin declared its own logger.
This would make implementing progress updates in the GUI (issue #6) easier and allow for easier and more focused debugging.
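
A minimal sketch of the per-component pattern, using only the standard `logging` module. `AutoRotateExample` is a hypothetical class used for illustration; the naming scheme (module path plus class name) is an assumption, not a decision recorded in this issue.

```python
import logging

# One logger per module, named after the module's import path
logger = logging.getLogger(__name__)


class AutoRotateExample:
    """Hypothetical plugin used only to illustrate the pattern."""

    def __init__(self):
        # Child logger of the module logger, e.g. '<module>.AutoRotateExample'
        self.logger = logging.getLogger(
            '{0}.{1}'.format(__name__, type(self).__name__))

    def process(self, path):
        self.logger.info("Rotating images in %s", path)
```

Because logger names form a hierarchy, a GUI could attach a handler to a single plugin's logger, or to a common parent, without touching the root logger.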

Display RGB data from camera directly in PySide

Currently in the GUI, the raw RGB image is grabbed from the camera, converted to a PIL Image object and converted to a QPixmap through PIL's QtImage function.
It would be good to skip the two intermediary steps and render the data directly to a QPixmap. This way, we could display a continuous stream with a reasonable framerate from the camera instead of single snapshots.

exiftool rotation makes performance crawl

When troubleshooting performance of spreads, we were constantly waiting for exiftool rotation.

We commented this out in chdkcamera.py and got a 4x speedup:

    # Set EXIF orientation
    #self.logger.debug("Setting EXIF orientation on captured image")
    #if self.target_page == 'odd':
    #    exif_orientation = 8  # 90°
    #else:
    #    exif_orientation = 6  # -90°
    #subprocess.check_output(
    #    ['exiftool', '-Orientation={0}'.format(exif_orientation), '-n',
    #     '-overwrite_original', local_path])

In addition, the images do show up in the web frontend now (though obviously not rotated).

Pieter suggests rotating on the client in CSS/JS for now. A small line of code to update is coming in this ticket in 5 minutes.

Change zoom in gui

I think it would be great if you could change the zoom level before you start a scan, so you can adjust it to the size of your book.
How could that be done in theory?

Change config.yaml and reload? Or send a PTP command to the camera, so that the change stays in effect for one capture process?

chdkptp_path in config file is ignored; errors are not helpful

I have my config pointing chdkptp_path to a particular dir:

# Valid values: 'none', 'debug', 'info', 'warning', 'error', 'critical'
loglevel: info
plugins: []

# Options for 'capture' step
capture:
    capture_keys: [' ', b]
driver: a1400
verbose: no
logfile: ~/.config/spreads/spreads.log
device:
    dpi: 300
    parallel_capture: yes
    chdkptp_path: /mnt/bulk/spreads/chdkptp-lib
    zoom_level: 3
    shoot_raw: no
    sensitivity: 80
    focus_distance: auto
    flip_target_pages: no
    shutter_speed: 1/25

But I get this error:

$ spread --verbose capture .
Workflow: Initializing workflow .
spreads.plugin: Creating plugin manager
stevedore: found extension EntryPoint.parse('scantailor = spreadsplug.scantailor:ScanTailorPlugin')
stevedore: found extension EntryPoint.parse('web = spreadsplug.web:WebCommands')
stevedore: found extension EntryPoint.parse('tesseract = spreadsplug.tesseract:TesseractPlugin')
stevedore: found extension EntryPoint.parse('djvubind = spreadsplug.djvubind:DjvuBindPlugin')
stevedore: found extension EntryPoint.parse('gui = spreadsplug.gui:GuiCommand')
stevedore: found extension EntryPoint.parse('autorotate = spreadsplug.autorotate:AutoRotatePlugin')
stevedore: found extension EntryPoint.parse('intervaltrigger = spreadsplug.intervaltrigger:IntervalTrigger')
stevedore: found extension EntryPoint.parse('hidtrigger = spreadsplug.hidtrigger:HidTrigger')
stevedore: found extension EntryPoint.parse('pdfbeads = spreadsplug.pdfbeads:PDFBeadsPlugin')
stevedore.extension: found extension EntryPoint.parse('chdkcamera = spreadsplug.dev.chdkcamera:CHDKCameraDevice')
stevedore.extension: found extension EntryPoint.parse('a2200 = spreadsplug.dev.chdkcamera:CanonA2200CameraDevice')
stevedore.extension: found extension EntryPoint.parse('a1400 = spreadsplug.dev.chdkcamera:CanonA1400CameraDevice')
spreads.plugin: Finding devices for driver "<stevedore.driver.DriverManager object at 0x29c4850>"
Instantiating device...
ChdkCamera: Device has serial number 20420664CAD94BDDA535E1F926988EEC
ChdkCamera: Calling chdkptp with arguments: [u'/usr/local/lib/chdkptp/chdkptp', '-c-d=039 -b=001', '-eset cli_verbose=2', '-eluar return(get_buildinfo())']
spreads encountered an error:
Traceback (most recent call last):
  File "/usr/local/bin/spread", line 39, in <module>
    cli.main()
  File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 446, in main
    args.subcommand(config)
  File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 242, in capture
    if len(workflow.devices) != 2:
  File "/usr/local/lib/python2.7/dist-packages/spreads/workflow.py", line 79, in devices
    self._devices = plugin.get_devices(self.config)
  File "/usr/local/lib/python2.7/dist-packages/spreads/plugin.py", line 359, in get_devices
    devices = list(driver_class.yield_devices(config['device']))
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 396, in yield_devices
    yield cls(config, dev)
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 380, in __init__
    super(CanonA1400CameraDevice, self).__init__(config, device)
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 94, in __init__
    get_result=True)
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 241, in _execute_lua
    output = self._run("{0} {1}".format(cmd, script))
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 227, in _run
    stderr=subprocess.STDOUT)
  File "/usr/lib/python2.7/subprocess.py", line 537, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1259, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

and with --verbose disabled, we see an extremely unhelpful (no mentions of path, chdkptp, etc.) error message:

$ spread capture .
Instantiating device...
spreads encountered an error:
Traceback (most recent call last):
  File "/usr/local/bin/spread", line 39, in <module>
    cli.main()
  File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 446, in main
    args.subcommand(config)
  File "/usr/local/lib/python2.7/dist-packages/spreads/cli.py", line 242, in capture
    if len(workflow.devices) != 2:
  File "/usr/local/lib/python2.7/dist-packages/spreads/workflow.py", line 79, in devices
    self._devices = plugin.get_devices(self.config)
  File "/usr/local/lib/python2.7/dist-packages/spreads/plugin.py", line 359, in get_devices
    devices = list(driver_class.yield_devices(config['device']))
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 396, in yield_devices
    yield cls(config, dev)
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 380, in __init__
    super(CanonA1400CameraDevice, self).__init__(config, device)
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 94, in __init__
    get_result=True)
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 241, in _execute_lua
    output = self._run("{0} {1}".format(cmd, script))
  File "/usr/local/lib/python2.7/dist-packages/spreadsplug/dev/chdkcamera.py", line 227, in _run
    stderr=subprocess.STDOUT)
  File "/usr/lib/python2.7/subprocess.py", line 537, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1259, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Everything works fine if I move my chdkptp installation into /usr/local/lib/ ... but then why did I even have the option to configure it?
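
A minimal sketch of how the configured directory could be honored and the failure reported more helpfully. Everything here is an assumption for illustration: `resolve_chdkptp` is a hypothetical helper, the device configuration is treated as a plain dict (spreads uses its own configuration objects), and the executable is assumed to be named `chdkptp` inside the configured directory, as in the default path shown in the verbose log.

```python
import os

DEFAULT_CHDKPTP_DIR = '/usr/local/lib/chdkptp'  # default seen in the log above


def resolve_chdkptp(device_config):
    """Return the chdkptp executable path, honoring 'chdkptp_path' if set."""
    base_dir = os.path.expanduser(
        device_config.get('chdkptp_path', DEFAULT_CHDKPTP_DIR))
    executable = os.path.join(base_dir, 'chdkptp')
    if not os.path.isfile(executable):
        # Name the path and the setting so the user knows what to fix
        raise OSError(
            "chdkptp executable not found at '{0}'; check the 'chdkptp_path' "
            "setting in the device configuration".format(executable))
    return executable
```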

HookPlugin wants plugins to override methods for all hookpoints, and hookpoints are sought by unicode name

Instead, one could have multiple classes inheriting from the HookPlugin super-class, one for each hookpoint.

Then, if a plugin did want to override multiple hookpoints, it could do a "mix-in" (multiple class inheritance).
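
A minimal sketch of this layout, with hypothetical class names. `ProcessHook`, `OutputHook` and `plugins_for` are illustrations of the idea, not existing spreads classes; the hookpoint names `process` and `output` are assumed from the workflow steps mentioned elsewhere in the tracker.

```python
class HookPlugin:
    """Base class shared by all hook plugins (hypothetical layout)."""

    def __init__(self, config):
        self.config = config


class ProcessHook(HookPlugin):
    def process(self, path):
        raise NotImplementedError


class OutputHook(HookPlugin):
    def output(self, path):
        raise NotImplementedError


class RotateAndBind(ProcessHook, OutputHook):
    """A plugin that opts into two hookpoints via multiple inheritance."""

    def process(self, path):
        return 'processed {0}'.format(path)

    def output(self, path):
        return 'bound {0}'.format(path)


def plugins_for(hook_class, plugins):
    """Select plugins by hook class instead of looking up method names."""
    return [plugin for plugin in plugins if isinstance(plugin, hook_class)]
```

With this design, the workflow would ask for `plugins_for(ProcessHook, plugins)` rather than calling `getattr(plugin, hook_name)` with a string.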

one device dies, scan cycle goes tits up

Symptoms:
While in a (webplugin) scanning cycle, switch off one of the cameras. Switch it on again. The camera now has a new device id on the same usb bus. The scanning cycle goes tits up.

Expected behaviour:
While in a (webplugin) scanning cycle, switch off one of the cameras. Switch it on again. The camera now has a new device id on the same usb bus. Spreads recognises this and continues its work.

DevicePlugin is coupled with USB devices

A subclass of DevicePlugin, USBDevicePlugin, should be written to hold the usb enumeration/selection code.

AttributeError: 'Device' object has no attribute 'bInterfaceSubClass'

(spreads 0.3.3 and git as of today)

spread --verbose download yields these warnings. Doesn't seem too harmful, but I thought I'd mention them.

stevedore 0.14 causes spreads to fail at startup

$ spread
spreads encountered an error:
Traceback (most recent call last):
File "/tmp/spreads/spread", line 39, in <module>
cli.main()
File "/tmp/spreads/spreads/cli.py", line 412, in main
parser = setup_parser(config)
File "/tmp/spreads/spreads/cli.py", line 333, in setup_parser
pluginmanager = get_pluginmanager(config)
File "/tmp/spreads/spreads/plugin.py", line 344, in get_pluginmanager
name_order=True)
File "/usr/local/lib/python2.7/dist-packages/stevedore/named.py", line 55, in __init__
verify_requirements)
TypeError: _load_plugins() takes exactly 4 arguments (5 given)

Cameras fail silently to shoot

Sometimes one of the cameras fails silently to capture. That is, the log looks absolutely fine, but one of the cameras has actually missed a page. Regarding this issue, jbaiter wrote: "The problem is that the 'capture' currently does not wait for feedback from the device but assumes everything to be fine when there was no PTP error.
I just saw in the documentation, though, that in CHDK 1.2 (the current development version) the Lua shoot command (that we use to trigger the capture) returns a value indicating whether or not the shot was successful: http://chdk.wikia.com/wiki/CHDK_scripting#shoot
I'll try to update my firmware on the weekend and see if I can integrate that into the code, maybe that can shed some more light on your problem."

Error messages in GUI

The following error message is printed multiple times when running the GUI wizard.
It's not critical, as the program still works, though still annoying.

(python:4635): Gtk-CRITICAL **: IA__gtk_progress_configure: assertion `value >= min && value <= max' failed

controller device ip configuration

The simplest way:

user connects controller device to postprocessing workstation directly (ethernet crossover cable link)
dhcp server on controller device assigns user ip
controller device dhcp server assigns its own minimal dns server to user. See http://code.activestate.com/recipes/491264-mini-fake-dns-server/?in=lang-python for a candidate.
user opens browser and goes to whatever address
controller device dns server responds with its own ip to every dns request

Another scenario I could think of (only 90%+ guaranteed result):

user connects controller device to his network.
network assigns controller device ip via dhcp.
controller device stores last received dhcp client ip to sd card.
user powers down controller device.
user pulls out sd card and writes down dhcp client ip stored on sd card.
user immediately puts back sd card and reboots controller device.
controller device receives same dhcp lease (this should easily be 90%+ of cases).
user navigates browser to read out ip.

Fallback scenario (requires access to router and willingness to try and understand instructions beyond ordinary user level):

user connects controller device to his network
network assigns controller device ip via dhcp.
user reads controller device ip from router dhcp client table based upon controller device mac address
user connects to ip

Another option may be to make the controller device avahi discoverable https://en.wikipedia.org/wiki/Avahi_(software) and http://www.linuxplanet.com/linuxplanet/reports/6826/1 .

https://en.wikipedia.org/wiki/Zeroconf

Windows Vista, 7, 8 use LLMNR https://en.wikipedia.org/wiki/LLMNR . GPL ipV6 implementation at http://www.vx68k.org/xllmnrd .

(old?) windows only protocol may catch some more scenarios: nmbd http://www.samba.org/samba/docs/man/manpages-3/nmbd.8.html

Better progress notification in GUI

Currently all of the progress bars in the GUI wizard just act as activity indicators but don't really give the user a way to estimate how much work has already been done.
It would therefore be beneficial to find a way to make these progress bars, where possible, display the current progress. Candidates for this include:

Downloading/Deleting files
Postprocessing

To begin, it would be enough to only update the progress bar once for every plugin executed.
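
The per-plugin update could be sketched as below. `run_with_progress` and the `on_progress` callback are hypothetical names; the sketch assumes each plugin exposes a `process(path)` method and is written for Python 3, where `/` is true division.

```python
def run_with_progress(plugins, path, on_progress):
    """Run each plugin's process() and report the fraction completed."""
    total = len(plugins)
    for index, plugin in enumerate(plugins, start=1):
        plugin.process(path)
        # Called once per finished plugin with a value in (0, 1]
        on_progress(index / total)
```

In the GUI, the callback would scale the fraction to the progress bar's integer range before setting its value.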

Web interface for postprocessing and output generation

Add a plugin that provides a web-based control over scanning and output generation workflows.

The following features should be included:

Allow clients to submit new projects through a RESTful API
Manage workflow queue, edit/delete/rearrange projects
Download/upload ScanTailor configuration files for manual inspection/editing
Download generated output files

webplugin configuration issue

raspberry pi on 993b4df ...

(spreadswebplugin)pi@raspberrypi ~/spreadswebplugin $ spread configure
Please select a device driver from the following list:


Select a driver: 2
Selected "a2200" as device driver
Please select your desired plugins from the following list:
    1: scantailor
    2: web
    3: djvubind
    4: gui
    5: colorcorrect
    6: autorotate
    7: tesseract
    8: pdfbeads
Select a plugin (or hit enter to finish): 2
    1: scantailor
  ✔ 2: web
    3: djvubind
    4: gui
    5: colorcorrect
    6: autorotate
    7: tesseract
    8: pdfbeads
Select a plugin (or hit enter to finish): 
The following postprocessing plugins were detected:

Please enter the extensions in the order that they should be invoked, separated by commas:

At least one of the entered extensions was notfound, please try again!

(stuck in infinite loop here)
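
The log shows an empty list of detected postprocessing plugins, so one plausible reading (an assumption, not confirmed in this report) is that the prompt can never be satisfied when nothing was detected. A minimal sketch of a prompt loop that returns immediately in that case is given below; `ask_plugin_order` and its `read_line` parameter are hypothetical, and the code targets Python 3.

```python
def ask_plugin_order(available, read_line=input):
    """Ask for a comma-separated plugin order; skip the prompt if none exist."""
    if not available:
        # Nothing to order, so do not prompt at all
        return []
    while True:
        names = [name.strip() for name in read_line().split(',')
                 if name.strip()]
        if all(name in available for name in names):
            return names
        print("At least one of the entered extensions was not found, "
              "please try again!")
```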

Improve test suite for core.

Currently, the test suite is severely lacking.
At least the spreads package should aim for coverage >80%.

Windows support

This would involve the following steps:

Test if all packages work properly on Windows (pyptpchdk?)
How to deal with setuptools/pkg_resources and plugins?
Configuration for external applications
Package into a nice MSI installer.

power down button in web interface

The rpi sd card is quite vulnerable to hard poweroff. We need a shutdown button in the web interface. (I wonder how to do that in the controller only version btw.)

Test suite for plugins

Currently, there are no tests for plugins.
Every plugin should have at least 80% test coverage.

Workflow object stores application state

The workflow object wants to be some sort of "macro"-like object, which is setup/loaded from a previously saved configuration, then used again and again.

This could be separated into a pickle-able (or yaml-able etc.) object and an ApplicationState object. "Workflow" would then be a configuration object, and should maybe be called SpreadsConfiguration? I like the name workflow... Man this is hard.

Bug in scantailor plugin?

Hi,

Whenever I run spread -v postprocess with the scantailor plugin active, the results are:

1-) Spreads calls scantailor and waits, I guess, for a signal that the files have been processed by that program:

spreadsplug.autorotate: Updating EXIF information for image '/home/asartori/bookscan/dickerson1/raw/0038.JPG'
spreadsplug.scantailor: Generating ScanTailor configuration
spreadsplug.scantailor: scantailor-cli --start-filter=2 --end-filter=5 --layout=1.5 --dpi=300 -o=/home/asartori/bookscan/dickerson1/dickerson1.ScanTailor --margins-top=2.5 --margins-right=2.5 --margins-bottom=2.5 --margins-left=2.5 /home/asartori/bookscan/dickerson1/raw /home/asartori/bookscan/dickerson1/done
spreadsplug.scantailor: Opening ScanTailor GUI for manual adjustment

2-) However, scantailor rejects the command issued by spreads and prints "The following file could not be loaded: /home/asartori/bookscan/dickerson1/raw". So it seems that scantailor expects file names where spreads passes the name of the input folder.
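If the installed scantailor-cli really does require individual file names, a possible workaround is to expand the input folder before building the command. The sketch below is only an illustration under that assumption: `scantailor_args` is a hypothetical helper, the extension list is a guess, and only a subset of the options visible in the log is reproduced:

```python
from pathlib import Path


def scantailor_args(raw_dir, out_dir, project_file,
                    extensions=(".jpg", ".jpeg", ".tif", ".tiff")):
    """Build a scantailor-cli argument list that names each input image."""
    images = sorted(
        str(path) for path in Path(raw_dir).iterdir()
        if path.suffix.lower() in extensions
    )
    if not images:
        raise ValueError("no images found in %s" % raw_dir)
    # Options copied from the log output; the margin options are omitted.
    options = ["--start-filter=2", "--end-filter=5", "--layout=1.5",
               "--dpi=300", "-o=%s" % project_file]
    return ["scantailor-cli"] + options + images + [str(out_dir)]
```

The resulting list can be passed to `subprocess.call` without a shell, which also avoids quoting problems with unusual file names.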

Add test for specific known good chdk firmware version

It would make sense to have a known good chdk firmware version for a given combination of camera/spreads/other components of a setup. This might take several problems out of the equation.

spideroak zipstream for downloading raw archives: benchmark and implement

[23:12] https://github.com/SpiderOak/ZipStream
[23:12] zipstream.py is a zip archive generator based on zipfile.py. It was created to generate a zip file on-the-fly for download in a web.py (http://webpy.org/) application. This is beneficial for when you want to provide a downloadable archive of a large collection of regular files, which would be infeasible to generate the archive prior to downloading.
[23:12] it basically streams a zipfile over http without ever touching the disk
[23:12] should work straight at is with Flask, 'send_file' accepts arbitrary generators as sources
[23:13] *as is
[23:13] did you get to that by my foolish remark?
[23:13] then I would be really happy
[23:13] say it inspired you, just to make me happy... please :p
[23:14] you're inspiring me all the time, mark .-)
[23:14] :-)
[23:17] the zipstream thing should basicly perform about as well as smb/ftp I guess...
[23:19] I wonder if it makes any sense for me to build a small python script that zipstreams 1000 1.5 Mb files
[23:19] and times that
[23:20] anyway, dumping that zipstream stuff into a github issue
[23:20] thanks .-)
[23:20] the benchmark should be fairly trivial to write
[23:20] http://flask.pocoo.org/docs/patterns/streaming/
[23:21] simple ~50loc flask application that exposes a single endpoint to stream some large zip file
[23:21] simple bash loop that runs wget a few dozen times and times the runs
[23:21] or a python script with 'requests' if you want to get fancy
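The benchmark discussed in the chat could start from something like the sketch below. It does not use SpiderOak's ZipStream; as a stand-in it relies on the standard library's `zipfile`, which can write to a non-seekable stream, and the names `_Sink`, `stream_zip` and `time_stream` are made up for this example. The resulting generator could presumably be wrapped in a Flask `Response`, as in the streaming pattern linked above.

```python
import io
import time
import zipfile


class _Sink(io.RawIOBase):
    """Write-only, non-seekable buffer whose contents can be drained."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    def writable(self):
        return True

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def drain(self):
        out = b"".join(self._chunks)
        self._chunks.clear()
        return out


def stream_zip(files):
    """Yield the bytes of an uncompressed zip for (name, bytes) pairs."""
    sink = _Sink()
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_STORED) as archive:
        for name, payload in files:
            archive.writestr(name, payload)
            yield sink.drain()
    # The central directory is written when the archive is closed.
    yield sink.drain()


def time_stream(count=1000, size=1_500_000):
    """Time streaming `count` dummy files of `size` bytes each."""
    payload = b"\0" * size
    files = (("%04d.jpg" % i, payload) for i in range(count))
    start = time.perf_counter()
    total = sum(len(chunk) for chunk in stream_zip(files))
    return total, time.perf_counter() - start
```

Calling `time_stream()` with the defaults mirrors the "1000 files of 1.5 MB" case from the chat, but it measures only archive generation in memory, not disk reads or HTTP transfer.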

Feature request: Allow zoom level to be set interactively

If I understand correctly, spreads currently sets the zoom level based on the value specified in ~/.config/spreads/config.yaml. Because books' dimensions are quite variable, it would be useful to be able to set zoom level on the fly. Modifying Mark's test_keypedal.sh script, I was able to implement this in the bash shell (see code below). Would it be easy to implement something similar in spreads? Thanks.

function set_zoom {
    while true;
    do
        read -p "Press Z or X to set camera zoom, Q to exit..." user_inp$
        case $user_input in
            [Zz]* ) echo "Zooming cameras out...";
                $PTPCAM --dev=$LEFTCAM --chdk='lua do click("zoom_out"$
                $PTPCAM --dev=$RIGHTCAM --chdk='lua do click("zoom_out$
            [Xx]* ) echo "Zooming cameras in...";
                $PTPCAM --dev=$LEFTCAM --chdk='lua do click("zoom_in")$
                $PTPCAM --dev=$RIGHTCAM --chdk='lua do click("zoom_in"$
            [Qq]* ) break;;
            * ) echo "Please press Z,X to set camera zoom or Q to exit..."
        esac
    done
    sleep 1s
}

Configuration API for plugins

Currently, every plugin has direct access to the app's global configuration.
It would be better if this were encapsulated either in a proxy method or class attributes that provided spreads with information about the available configuration options for the plugin, their data type and a documentation string.
This way, plugins would no longer have to add command-line arguments themselves, but would just provide the configuration options at a higher level of abstraction. Every UI plugin could then generate a user interface from that data.
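A rough sketch of what such a declarative description might look like, with a command-line front end generated from it. `Option`, `ExamplePlugin` and `add_plugin_arguments` are hypothetical names chosen for illustration and do not describe an existing spreads API:

```python
import argparse
from collections import namedtuple

# Hypothetical description of one option: default value, type and help text.
Option = namedtuple("Option", ["default", "type", "doc"])


class ExamplePlugin:
    """Stand-in plugin that only declares its options."""
    name = "example"
    options = {
        "dpi": Option(300, int, "Resolution of the output images"),
        "language": Option("eng", str, "OCR language"),
    }


def add_plugin_arguments(parser, plugin):
    """Derive command-line flags from a plugin's declared options."""
    for key, opt in plugin.options.items():
        parser.add_argument("--%s-%s" % (plugin.name, key),
                            default=opt.default, type=opt.type, help=opt.doc)
    return parser
```

A GUI or web plugin could iterate over the same `options` mapping to build form fields, so the plugin itself never touches `argparse`.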

spreads PPA and APT-repo for easier deployment

Currently, users have to install spreads either via pip or from source.
A better way would be to provide packages for the most popular distributions.
To begin with, a PPA and an APT repository for Ubuntu-based and Debian distributions should cover a lot of users.

Make path semantics more clear

Currently, the path semantics for plugins are quite unclear. It should be made more specific which folder is supposed to contain what.
This should be stated both in the documentation and the docstrings for the plugin API.

A working draft could be:

raw: Images as they were on the device, possibly rotated
done: Images after post-processing
out: Output files

Missed commands on one of the cameras

Sometimes a command issued to both connected cameras arrives at only one of them. This happens quite frequently during the prepare_capture step, when setting zoom levels, and during the download step. It occurs less frequently during the shooting process itself.
This could be related to the use of threads to issue commands to both cameras at the same time.
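One way to make such failures visible is to collect the result of every per-camera call and retry only the cameras that failed. The sketch below assumes that a missed command surfaces as an exception, which may not hold for the real driver; `send_to_all` and its parameters are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor


def send_to_all(devices, command, retries=2):
    """Run `command(device)` on all devices in parallel, retrying failures.

    Returns the devices that still failed after the last attempt.
    """
    pending = list(devices)
    for _ in range(retries + 1):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=len(pending)) as pool:
            results = [(dev, pool.submit(command, dev)) for dev in pending]
        # Leaving the `with` block waits for every submitted call to finish.
        pending = [dev for dev, future in results
                   if future.exception() is not None]
    return pending
```

An empty return value means that every camera acknowledged the command; anything else can be reported to the user instead of being silently lost.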

Misconfiguration causes crashes, not obvious how to recover

On a fresh machine, do:
$ sudo pip install git+http://github.com/DIYBookScanner/spreads.git
$ sudo pip install git+http://github.com/matti-kariluoma/spreadsplug_dummydevice.git
$ # or any plugin, really
$ spread configure
$ # configure the selected plugin
$ sudo pip uninstall spreadsplug_dummydevice
$ spread
$ # errors before showing list of commands
$ spread wizard
$ # errors before showing the "path needed" message
$ spread configure
$ # Errors without letting me configure!!

To recover, you need to run
$ rm ~/.config/spreads/config.yaml

before your box is usable again. The location of this file can also vary, depending on environment variables and the OS.
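A more forgiving startup could check the configured plugins against what is actually importable and warn instead of crashing. This is only a sketch: `split_plugins` is a hypothetical helper, and it assumes plugins can be identified by a top-level module name, which may differ from how spreads discovers them:

```python
import importlib.util


def split_plugins(configured, is_installed=None):
    """Split configured plugin names into usable and missing ones."""
    if is_installed is None:
        # Assumption: a plugin counts as installed if its module can be found.
        def is_installed(name):
            return importlib.util.find_spec(name) is not None
    usable = [name for name in configured if is_installed(name)]
    missing = [name for name in configured if not is_installed(name)]
    return usable, missing
```

With the list of missing names in hand, `spread configure` could still start and offer to drop them from the configuration file.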

test tesseract performance on arm

After some research on ARM performance, it seems we might very well get relatively decent tesseract OCR performance from Allwinner A20-based controller boards (dual-core ARM Cortex A7).

arm performance: Cortex A57 (not yet available) > Cortex A15 > Cortex A7 > Cortex A9 > Cortex A8
price: Cortex A7 < Cortex A15
tesseract is thread safe

web interface: capture action needs feedback

(I am not a usability expert. This is at the core of the user experience, so it might really benefit from professional user experience advice.)

The capture action needs feedback. In my humble opinion, ideally:

When the cameras are being triggered, the capture action becomes temporarily unavailable. The web interface immediately visualises this. The best might be to grey out the capture button, plus some feedback in a fixed status area ("cameras capturing"?).
When the cameras are available for capturing the next shot, the user gets an unobtrusive but clearly perceptible signal. I would make sure the capture button in the web interface immediately becomes available again, a camera emits a sound signal, and a fixed status area in the web app shows a message ("cameras ready to capture"?).

Let's see how much of this feedback can be given within the limits of long polling. A sound signal from the cameras will probably get us a long way without the need for an immediate event in the browser...

Maybe something for Pieter to work on?

init script for spreads web application

The web application needs an init script for integration with Debian/Raspbian.

Something like http://kuttler.eu/code/debian-init-script-virtualenv-gunicorn-django/ might be a good starting point.

Feel free to assign that small task to me.

DeviceFeatures might be better as mix-in interfaces rather than enums

Not sure if necessary, but two pros I can think of:

'features' tuple need not be updated by hand on each device plugin
class names instead of ints when being logged/inspected
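To illustrate both points, here is a small sketch of features expressed as mix-in classes. All class names are invented for the example and do not correspond to the existing `DeviceFeatures` values:

```python
class DeviceFeature:
    """Common base class for all feature mix-ins (hypothetical)."""


class CanPreview(DeviceFeature):
    """Marks a device that can deliver preview images."""


class IsCamera(DeviceFeature):
    """Marks a device that behaves like a camera."""


class ExampleCamera(CanPreview, IsCamera):
    """Hypothetical driver; its features follow from its base classes."""


def features_of(device_cls):
    """List the names of all feature mix-ins a device class inherits."""
    return [base.__name__ for base in device_cls.__mro__
            if issubclass(base, DeviceFeature)
            and base not in (DeviceFeature, device_cls)]
```

Here `features_of(ExampleCamera)` yields readable class names for logging, and a check such as `isinstance(device, CanPreview)` replaces a lookup in a hand-maintained tuple.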

Add support for Canon A810 in CHDK driver

Documentation on installing ScanTailor enhanced

Hi again,

The tutorial on http://spreads.readthedocs.org/en/latest/tutorial.html suggests installation of scantailor enhanced using:

wget <scantailor-deb-url>
sudo dpkg -i <scantailor-deb>

But I was not able to locate a Debian package for scantailor enhanced. Could you please point me to the package you've used? I am trying to install the enhanced version in Ubuntu.

Thanks.

Downloading fails in both GUI and cli mode

(This is on Debian stable. From memory, since I don't have the devices available for the next week.)

In GUI:

detection, triggering and "live image" work
downloading triggers an error in the gui code. Sorry I can't be more precise.

In the CLI, the error messages are similar to those in comment 23322411 on #16.

I will try to have a closer look myself later too. Thank you for your nice work!

Store orientation information in EXIF tag

Currently, the information about how to rotate images is stored as a string (left/right) in a JPEG comment field.
It would be better to store this information in the 'Orientation' field of the EXIF tags, e.g. using pexif or EXIF.py.
This way, there would no longer be any guesswork in the autorotate plugin.
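For reference, the EXIF Orientation tag encodes rotation as a small integer. The sketch below shows how the current left/right string could map onto it; the rotation table follows the EXIF definition, while the assignment of sides to values is an assumption that depends on how the cameras are mounted:

```python
# EXIF Orientation values and the clockwise rotation (in degrees) a viewer
# must apply to display the stored image upright.
ROTATION_FOR_ORIENTATION = {1: 0, 3: 180, 6: 90, 8: 270}

# Assumption for illustration only: which side gets which value depends on
# the physical orientation of each camera.
ORIENTATION_FOR_SIDE = {"left": 8, "right": 6}


def rotation_for_side(side):
    """Return the clockwise rotation implied by a device's side."""
    return ROTATION_FOR_ORIENTATION[ORIENTATION_FOR_SIDE[side]]
```

The autorotate plugin would then only need to read the tag and look up the rotation, without parsing a free-form comment.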

Combine button in GUI does not work

The following error occurs when trying to trigger a new "combine" run in the GUI wizard:

Traceback (most recent call last):
  File "/home/tommy/.spreads/local/lib/python2.7/site-packages/spreadsplug/gui/gui.py", line 390, in doDownload
    self.combine_btn.clicked.connect(get_pluginmanager()['combine']
AttributeError: 'Extension' object has no attribute 'download'