ocr's People

Contributors

comradekingu, mathiasconradt, morrisjobke, nextcloud-bot, rakekniven, scrutinizer-auto-fixer, stweil, valdnet

ocr's Issues

Bug report: Infinite OCR queue when OCRWorker was not running

Nextcloud 11.0.0, OCR 2.0.0.

When I try to OCR a file while OCRWorker is not running, the OCR queue grows bigger and bigger.

After OCRWorker is started, freshly queued documents are OCR'd, but the old queue entries from before never disappear.

The Apache log also fills up with many entries like this:
"GET /apps/ocr/status HTTP/1.1" 200 997 "

For now, the only workaround is to disable the app until the bug is fixed.

907496ba4ea25049089e7ec3020def7cb378af5b

OCR app not showing in app list after downloaded

Bug report

Expected Behavior

I tried to install this app both by downloading the tarball from the Nextcloud appstore and by cloning the repo. It should appear as an inactive app.

Current Behavior

App can't be found in list.

Steps to Reproduce (for bugs)

  1. Install tesseract and ocrmypdf from package manager (Raspbian 8 unstable)
  2. Clone git repository / download from NC appstore.
  3. Open app settings page
  4. App isn't there.

Context

I tried to install the app from the old ownCloud appstore. As it is not there, I changed my settings.php to point to the NC appstore. I couldn't find it there either, so I downloaded the tar.gz from there. I noticed some problems with file permissions and fixed them manually, but the app still doesn't appear in the list.
I also tried to install the app by cloning the repository and found the same permission issue.
The permissions in app folder are:

  • -rw-r--r-- in files
  • drwx------ in folders
    I checked other apps' permissions and saw that their folders are drwxr-xr-x.
    Additionally, when there is an ocr folder in nc/apps, the 'disabled' section can't load; there's only the spinning circle. Deleting the ocr folder fixes this.
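
The permission mismatch described above can be evened out with a short shell helper. This is only a sketch: the function name fix_app_perms and the example path are assumptions, and the report does not confirm that permissions are what keeps the app out of the list.

```shell
#!/bin/sh
# fix_app_perms DIR: give directories drwxr-xr-x (755) and files -rw-r--r-- (644),
# matching what the report observed for the other apps.
# The function name is hypothetical; it is not part of the OCR app.
fix_app_perms() {
    app_dir="$1"
    [ -d "$app_dir" ] || { echo "not a directory: $app_dir" >&2; return 1; }
    find "$app_dir" -type d -exec chmod 755 {} +
    find "$app_dir" -type f -exec chmod 644 {} +
}

# Example (hypothetical path; adjust to your installation):
# fix_app_perms /var/www/nextcloud/apps/ocr
```

Ownership is a separate matter: the web server user still has to be able to read the folder.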

Your Environment

  • OCR version used: 1.0.0
  • Browser Name and version: Firefox 50 and Chromium 53.0.2785.143
  • Operating System and version (desktop or mobile): Ubuntu 16.10
  • ownCloud/nextcloud version: 10.0.1
  • PHP version: 7.0.12-1 (cli)
  • Database version: Mysql 5.6.30
  • Are you using encryption: no

Systemd unit file for worker

Instead of running supervisor, I've added this to systemd:

/etc/systemd/system/nextcloud-ocr-worker.service:

[Unit]
Description=OCRWorker for Nextcloud OCR
After=apache2.service

[Service]
User=www-data
Group=www-data
ExecStart=/usr/bin/php /var/www/nextcloud/apps/ocr/worker/OCRWorker.php
Nice=19
Restart=always

[Install]
WantedBy=multi-user.target

OCR processing does not support shared folders

Bug report / Feature request

Expected Behavior

A file in a folder shared by another user should be able to be processed by the OCR app.

Current Behavior

Currently you may select files shared by other users and start the OCR process but it never finishes since the file is not accessible via nextcloud/data/user/current_user/files.

Possible Solution

There are two possible solutions: first, disallow OCR processing of shared files; or second, implement a way to find the real path of the file to process and, if it is writable by the current user, process it.

Steps to Reproduce (for bugs)

  1. Share a directory of user A to user B, writeable.
  2. Put a PDF in it.
  3. Log in as user B, select the file and
  4. choose OCR from the context menu.

Context

Your Environment

  • OCR version used: 2.0.0
  • Browser Name and version: All
  • Operating System and version (desktop or mobile): Linux/Mac
  • ownCloud/nextcloud version: Nextcloud 11
  • php 7
  • mysql
  • Are you using encryption: no

Log File Content (nextcloud/owncloud.log of the "data"-directory)

OCRWorker outputs "ERROR - File not found"

Build failing.

Fix the failing build. It is only an issue for the php7 build.

FR: Advanced logging

Feature request

Expected Behavior

If a command/file fails to process, the OCRWorker will receive error messages, which can be very helpful for bug tracking AND for problem resolution by the user.

Current Behavior

None of these messages are logged or displayed when the status is updated to FAILED.

Possible Solution

The whole message received by the OCRWorker could be transferred (via a tmp file solution) to the occ command, logged, and shown as a "mouse-hover-tip" in the personal settings page.

Context

This would make bug tracking easier and provide much more information about why a file failed.

Test with ubuntu

I will test the app with ubuntu 14.04 LTS on a VM in the next days.

OCRWorker.php fails to automatically ocr newly uploaded pdf file

OCRWorker.php is running as a systemd service, as described in the documentation, with the user and group www-data, the same as the web server.
Yet OCRWorker fails to OCR a newly uploaded PDF file.

Bug report / Feature request

Expected Behavior

The documentation states that OCRWorker.php is supposed to automatically OCR a newly added PDF file.

Current Behavior

It doesn't OCR the PDF file, even after waiting about an hour. However, the command in the overlay menu does OCR the file.
I would prefer the automatic OCR to work.

Possible Solution

Where to look to troubleshoot this?
Is there a log?
Has anyone had the same issue with OCRWorker and solved it?

Steps to Reproduce (for bugs)

  1. Install the ocr app and its prerequisites, as described in the documentation.
  2. Install the OCRWorker.php daemon using the systemd option as detailed in the wiki.
  3. Upload a PDF file containing a scanned document to ownCloud/Nextcloud.
  4. Watch the process list: OCRWorker.php does nothing, even after a long time.
  5. Click on the overlay menu and start the OCR manually. In the process list, the OCRWorker daemon and tesseract run with high load for 10 seconds, and a new file with the _OCR.pdf suffix is produced next to the original file, correctly containing the OCR'ed data.

Context

Your Environment

  • OCR version used: latest version from here.
  • Browser Name and version: Firefox 52.0b3 latest version.
  • Operating System and version (desktop or mobile): Windows 10 latest updates. Linux Debian 8 server.
  • ownCloud/nextcloud version: (see ownCloud admin page or version.php) latest version nextcloud.
  • PHP version 7.0
  • Database version Mysql Mariadb 5.6
  • Are you using encryption: yes/no No.

Log File Content (nextcloud/owncloud.log of the "data"-directory)

App "Array" cannot be installed.

Bug report

Expected Behavior

Enabling the OCR app in Nextcloud 11 should succeed

Current Behavior

App "Array" cannot be installed because the following dependencies are not fulfilled: The command line tool ocrmypdf could not be found

Possible Solution

Steps to Reproduce (for bugs)

  1. Install Ocrmypdf-tess4 docker image
  2. Alias the ocrmypdf command using
    alias ocrmypdf='docker run --rm -v "$(pwd):/home/docker" ocrmypdf'
  3. Run ocrmypdf on the CLI. (Success.)
  4. Attempt to enable app in Nextcloud.
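
A note on step 2: a shell alias exists only inside the shell session that defines it, so a process started by the web server never sees it. That may explain why the dependency check cannot find ocrmypdf, although the report does not confirm this. A wrapper script on the PATH is one way to test the idea. The sketch below is untested against the app; install_wrapper is a hypothetical helper name, and the docker command line is copied from the alias above.

```shell
#!/bin/sh
# install_wrapper DIR: write an executable "ocrmypdf" wrapper script into DIR.
# Unlike an alias, a script on the PATH is visible to every process.
# The function name is hypothetical.
install_wrapper() {
    target_dir="$1"
    cat > "$target_dir/ocrmypdf" <<'EOF'
#!/bin/sh
# Forward all arguments to the container, mounting the current directory
# exactly as the alias in the report does.
exec docker run --rm -v "$(pwd):/home/docker" ocrmypdf "$@"
EOF
    chmod 755 "$target_dir/ocrmypdf"
}

# Example (needs write access to the target directory):
# install_wrapper /usr/local/bin
```

Even with the wrapper in place, the web server user would still need permission to run docker, and files outside the current directory would not be visible inside the container.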

Context

Your Environment

  • OCR version used: ocrmypdf-tess4:latest docker image
  • Browser Name and version: Chrome latest
  • Operating System and version (desktop or mobile): Dell PE2900 running Ubuntu 1604, Docker Server v. 17.01.3-ce
  • ownCloud/nextcloud version: (see ownCloud admin page or version.php) 11.02
  • PHP version 7.0.17-2+deb.sury.org~xenial+1
  • Database version
  • Are you using encryption: yes/no: no.

Log File Content (nextcloud/owncloud.log of the "data"-directory)

[Feature Request] Setup prettier js development and add unit tests

At the moment the JS code is really ugly and not testable at all. I want to set up a package.json inside the "js" folder:

  • js:

    • src
    • dist
    • test
  • Webpack as bundler for the dist/ocr-app.js file (npm run buildApp).

  • Webpack as bundler for the dist/ocr-personal.js file (npm run buildPersonal).

  • Jasmine for unit tests (separate tsconfig.json) (step: npm run test).

  • Restructure the client code to a good class structure.

  • Add unit tests for the classes/methods.

  • Adjust the php application to only include the ocr-app.js file in the dist folder.

  • Adjust the .travis.yml so that the node_modules are installed and the tests run properly.

OCR 2.0.0 does not recognize deu-frak

Nextcloud 11.0.0 on Ubuntu 16.04, java 1.8.0_111
Nextant 0.10.4, OCRmyPDF 4.3.3, tesseract-ocr 3.04.01

apt-get install tesseract-deu-frak

deu-frak does not appear in the list of installed languages
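
To narrow this down, it may help to check whether tesseract itself lists the language, independently of the OCR app. The sketch below filters the output of tesseract --list-langs; the helper name has_lang is an assumption made for illustration.

```shell
#!/bin/sh
# has_lang NAME: succeed if NAME appears as a complete line on stdin.
# The function name is hypothetical.
has_lang() {
    grep -qxF -- "$1"
}

# Example: tesseract 3.x may print the language list on stderr, hence 2>&1.
# tesseract --list-langs 2>&1 | has_lang deu-frak && echo "tesseract knows deu-frak"
```

If tesseract lists deu-frak but the app does not, the problem is more likely in how the app reads or filters the language list.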

9a3a5d95c1859bce0b2717b7b0ad3b708529f0f5

Error when trying to process a pdf

Bug report

Expected Behavior

Current Behavior

Your Environment

  • OCR version used: 2.3.0
  • Browser Name and version: Chrome 56.0.2924.87 (64 bit)
  • Operating System and version (desktop or mobile): Windows 10 Pro 1607 / build: 14393.693
  • nextcloud version: 11.0.1
  • PHP version: 7.0.13-0ubuntu0.16.04.01
  • Database version: mysql Ver 14.14 Distrib 5.7.17
  • Are you using encryption: no

Log File Content (nextcloud/owncloud.log of the "data"-directory)

Error index OCP\AppFramework\Db\DoesNotExistException: Did expect one result but found none when executing: query "SELECT file_target FROM *PREFIX*share WHERE file_source = ? AND share_with = ? AND uid_owner = ?"; parameters Array ( [0] => 13392691 [1] => rascal [2] => local::/home/ ) ; limit ""; offset "" /var/www/nextcloud/lib/public/AppFramework/Db/Mapper.php - line 373: OCP\AppFramework\Db\Mapper->findOneQuery('SELECT file_tar...', Array, NULL, NULL) /var/www/nextcloud/apps/ocr/lib/Db/ShareMapper.php - line 42: OCP\AppFramework\Db\Mapper->findEntity('SELECT file_tar...', Array) /var/www/nextcloud/apps/ocr/lib/Service/OcrService.php - line 320: OCA\Ocr\Db\ShareMapper->find('13392691', 'rascal', 'local /home/') /var/www/nextcloud/apps/ocr/lib/Service/OcrService.php - line 173: OCA\Ocr\Service\OcrService->buildTargetForShared(Object(OCA\Ocr\Db\File)) /var/www/nextcloud/apps/ocr/lib/Controller/OcrController.php - line 74: OCA\Ocr\Service\OcrService->process(Array, Array) /var/www/nextcloud/apps/ocr/lib/Controller/Errors.php - line 35: OCA\Ocr\Controller\OcrController->OCA\Ocr\Controller\{closure}() /var/www/nextcloud/apps/ocr/lib/Controller/OcrController.php - line 75: OCA\Ocr\Controller\OcrController->handleNotFound(Object(Closure)) [internal function] OCA\Ocr\Controller\OcrController->process(Array, Array) /var/www/nextcloud/lib/private/AppFramework/Http/Dispatcher.php - line 160: call_user_func_array(Array, Array) /var/www/nextcloud/lib/private/AppFramework/Http/Dispatcher.php - line 90: OC\AppFramework\Http\Dispatcher->executeController(Object(OCA\Ocr\Controller\OcrController), 'process') /var/www/nextcloud/lib/private/AppFramework/App.php - line 114: OC\AppFramework\Http\Dispatcher->dispatch(Object(OCA\Ocr\Controller\OcrController), 'process') /var/www/nextcloud/lib/private/AppFramework/Routing/RouteActionHandler.php - line 47: OC\AppFramework\App main('OcrController', 'process', Object(OC\AppFramework\DependencyInjection\DIContainer), Array) [internal 
function] OC\AppFramework\Routing\RouteActionHandler->__invoke(Array) /var/www/nextcloud/lib/private/Route/Router.php - line 299: call_user_func(Object(OC\AppFramework\Routing\RouteActionHandler), Array) /var/www/nextcloud/lib/base.php - line 1010: OC\Route\Router->match('/apps/ocr') /var/www/nextcloud/index.php - line 40: OC handleRequest() {main} 2 minutes ago Error ocr Exception during ocr service function processing: {"Exception":"OCP\\AppFramework\\Db\\DoesNotExistException","Message":"Did expect one result but found none when executing: query \"SELECT file_target FROM *PREFIX*share WHERE file_source = ? AND share_with = ? AND uid_owner = ?\"; parameters Array\n(\n [0] => 13392691\n [1] => rascal\n [2] => local::\/home\/\n)\n; limit \"\"; offset \"\"","Code":0,"Trace":"#0 \/var\/www\/nextcloud\/lib\/public\/AppFramework\/Db\/Mapper.php(373): OCP\\AppFramework\\Db\\Mapper->findOneQuery('SELECT file_tar...', Array, NULL, NULL)\n#1 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Db\/ShareMapper.php(42): OCP\\AppFramework\\Db\\Mapper->findEntity('SELECT file_tar...', Array)\n#2 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Service\/OcrService.php(320): OCA\\Ocr\\Db\\ShareMapper->find('13392691', 'rascal', 'local::\/home\/')\n#3 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Service\/OcrService.php(173): OCA\\Ocr\\Service\\OcrService->buildTargetForShared(Object(OCA\\Ocr\\Db\\File))\n#4 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/OcrController.php(74): OCA\\Ocr\\Service\\OcrService->process(Array, Array)\n#5 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/Errors.php(35): OCA\\Ocr\\Controller\\OcrController->OCA\\Ocr\\Controller\\{closure}()\n#6 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/OcrController.php(75): OCA\\Ocr\\Controller\\OcrController->handleNotFound(Object(Closure))\n#7 [internal function]: OCA\\Ocr\\Controller\\OcrController->process(Array, Array)\n#8 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Http\/Dispatcher.php(160): call_user_func_array(Array, 
Array)\n#9 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Http\/Dispatcher.php(90): OC\\AppFramework\\Http\\Dispatcher->executeController(Object(OCA\\Ocr\\Controller\\OcrController), 'process')\n#10 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/App.php(114): OC\\AppFramework\\Http\\Dispatcher->dispatch(Object(OCA\\Ocr\\Controller\\OcrController), 'process')\n#11 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Routing\/RouteActionHandler.php(47): OC\\AppFramework\\App::main('OcrController', 'process', Object(OC\\AppFramework\\DependencyInjection\\DIContainer), Array)\n#12 [internal function]: OC\\AppFramework\\Routing\\RouteActionHandler->__invoke(Array)\n#13 \/var\/www\/nextcloud\/lib\/private\/Route\/Router.php(299): call_user_func(Object(OC\\AppFramework\\Routing\\RouteActionHandler), Array)\n#14 \/var\/www\/nextcloud\/lib\/base.php(1010): OC\\Route\\Router->match('\/apps\/ocr')\n#15 \/var\/www\/nextcloud\/index.php(40): OC::handleRequest()\n#16 {main}","File":"\/var\/www\/nextcloud\/lib\/public\/AppFramework\/Db\/Mapper.php","Line":289}

wiki supervisor example needs www-data user

Bug report

Following the Supervisord example results in the file sitting in the queue.

Expected Behavior

To process the files

Current Behavior

The file stays in a pending mode

Possible Solution

I added user=www-data to supervisord.conf which fixed it.

Steps to Reproduce (for bugs)

Install per current instructions

Context

Your Environment

  • OCR version used: 2.0
  • Browser Name and version: Chrome 55.0.2883.87 (64-bit)
  • Operating System and version (desktop or mobile): elementaryOS Loki
  • ownCloud/nextcloud version: 11.0.1 (stable)
  • PHP version: 7.0.13-0ubuntu0.16.10.1
  • Database version: mysql Ver 14.14 Distrib 5.7.17
  • Are you using encryption: no

OCR needs undocumented installation steps to work on newer systems (e. g. Debian Stretch)

An installation of the OCR app on Debian Stretch fails to work. Users are able to start the OCR with a selected language, but then see a never ending sequence of messages reporting that "Temp file does not exist.". This sequence can only be stopped by hacking the database and removing the row related to the OCR job. Technically, the code should reset the job status before raising the exception (see pull request #64).

The file which was not found exists in /tmp/ and has the correct permissions. Nevertheless, the PHP code claims that the file does not exist. This is caused by a security feature of systemd: apache2.service gets its own /tmp directory. Therefore /tmp seen by PHP code running in Apache2 is not the same as /tmp seen by the OCRWorker.php process.

It is possible to disable the security feature with a private /tmp for apache2.service, but that would be a bad solution. The better solution is using systemd to start OCRWorker.php and tell it to share /tmp with apache2.service. This works for me.
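
A unit file along those lines could look roughly like the sketch below, which writes it out from a shell script. JoinsNamespaceOf= and PrivateTmp= are regular systemd directives. The unit name and ExecStart path are taken from the unit file shown earlier on this page. The UNIT_DIR variable is an assumption added so the file can be generated in a scratch directory first.

```shell
#!/bin/sh
# Sketch: generate a unit file that lets the worker share Apache's private /tmp.
# UNIT_DIR is a hypothetical override; on a real host the file would go to
# /etc/systemd/system.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
UNIT_FILE="$UNIT_DIR/nextcloud-ocr-worker.service"

cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=OCRWorker for Nextcloud OCR
After=apache2.service
# Join the temporary-file namespace of apache2.service so both see the same /tmp.
JoinsNamespaceOf=apache2.service

[Service]
User=www-data
Group=www-data
# PrivateTmp must be enabled in this unit as well for the join to take effect.
PrivateTmp=true
ExecStart=/usr/bin/php /var/www/nextcloud/apps/ocr/worker/OCRWorker.php
Restart=always

[Install]
WantedBy=multi-user.target
EOF

echo "Wrote $UNIT_FILE"
```

After copying the file to /etc/systemd/system, systemctl daemon-reload followed by systemctl enable --now nextcloud-ocr-worker.service would activate it.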

(NC 10.0.1) while installing: Array to string conversion

I tried both the master.zip and version 0.8.8: same error.

I put the folder in apps/ and opened the Apps page. The list of disabled apps can't be displayed, and I got this message:

Array to string conversion at /home/www/kh.ro/dev/nextcloud-dev/settings/Controller/AppSettingsController.php#238

I had to remove the German version in info.xml (name/summary/description):

<name>OCR</name>
<summary >Character recoginition for your images and pdf files.</summary>
<description><![CDATA[# Description
[![Build Status](https://travis-ci.org/janis91/ocr.svg?branch=master)](https://travis-ci.org/janis91/ocr) [![Scrutinizer Code Quality](https://scrutinizer-ci.com/g/janis91/ocr/badges/quality-score.png?b=master)](https://scrutinizer-ci.c$
**This software is in beta phase 
[...]

[Feature Request] Offloading OCR processing to dedicated server

Feature request
This may already be available and implicit in docs and I've just failed to understand it.
Make it possible to run the OCR work on another server to help split workloads.

On larger installations (250+ users, 1m+ docs) the servers have been specified for the job envisaged. Adding in (the very useful) OCR functionality puts additional load on the frontend server and potentially impacts UX for all users; offloading it reduces impact and also means the OCR server is more easily updated/optimised.

OCR does not process files in external storage (local) folder

Bug report

Expected Behavior

The file should be OCRed as in any other common Nextcloud folder.

Current Behavior

An error is shown at the top of the web interface and the file is not processed.

Possible Solution

Steps to Reproduce (for bugs)

  1. Install the "External storage support" app
  2. Create a local folder on the server (mine, /data/data, is outside the /var/www/nextcloud path). This folder should be owned by the www-data user and group; for more details please check https://docs.nextcloud.com/server/11/admin_manual/configuration_files/external_storage/local.html
  3. In Admin > External storages, add the previously configured folder.
  4. Add a file through the Nextcloud Mac OS X client.
  5. Try to OCR the file via the web interface; it shows an error.

Context

I'm trying to OCR a file in an external storage local folder to make it available to other users with access to this folder.

Your Environment

  • OCR version used: 2.3.0
  • Browser Name and version: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:51.0) Gecko/20100101 Firefox/51.0
  • Operating System and version (desktop or mobile): Mac OS X 10.10.5
  • ownCloud/nextcloud version: Nextcloud 11.0.1 (stable)
  • PHP version: 7.0.13
  • Database version: PostgreSQL 9.5.5 on i686-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609, 32-bit
  • Are you using encryption: no

Log File Content (nextcloud.log of the "data"-directory)

{"reqId":"2cHVIjfj1kBYuU64e6xu","remoteAddr":"192.168.1.104","app":"ocr","message":"Exception during ocr service function processing: {"Exception":"OCP\\AppFramework\\Db\\DoesNotExistException","Message":"Did expect one result but found none when executing: query \"SELECT file_target FROM PREFIXshare WHERE file_source = ? AND share_with = ? AND uid_owner = ?\"; parameters Array\n(\n [0] => 320\n [1] => [email protected]\n [2] => local::\/data\/data\/\n)\n; limit \"\"; offset \"\"","Code":0,"Trace":"#0 \/var\/www\/nextcloud\/lib\/public\/AppFramework\/Db\/Mapper.php(373): OCP\\AppFramework\\Db\\Mapper->findOneQuery('SELECT file_tar...', Array, NULL, NULL)\n#1 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Db\/ShareMapper.php(42): OCP\\AppFramework\\Db\\Mapper->findEntity('SELECT file_tar...', Array)\n#2 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Service\/OcrService.php(320): OCA\\Ocr\\Db\\ShareMapper->find(320, '[email protected]...', 'local::\/data\/da...')\n#3 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Service\/OcrService.php(173): OCA\\Ocr\\Service\\OcrService->buildTargetForShared(Object(OCA\\Ocr\\Db\\File))\n#4 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/OcrController.php(74): OCA\\Ocr\\Service\\OcrService->process(Array, Array)\n#5 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/Errors.php(35): OCA\\Ocr\\Controller\\OcrController->OCA\\Ocr\\Controller\\{closure}()\n#6 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/OcrController.php(75): OCA\\Ocr\\Controller\\OcrController->handleNotFound(Object(Closure))\n#7 [internal function]: OCA\\Ocr\\Controller\\OcrController->process(Array, Array)\n#8 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Http\/Dispatcher.php(160): call_user_func_array(Array, Array)\n#9 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Http\/Dispatcher.php(90): OC\\AppFramework\\Http\\Dispatcher->executeController(Object(OCA\\Ocr\\Controller\\OcrController), 'process')\n#10 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/App.php(114): 
OC\\AppFramework\\Http\\Dispatcher->dispatch(Object(OCA\\Ocr\\Controller\\OcrController), 'process')\n#11 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Routing\/RouteActionHandler.php(47): OC\\AppFramework\\App::main('OcrController', 'process', Object(OC\\AppFramework\\DependencyInjection\\DIContainer), Array)\n#12 [internal function]: OC\\AppFramework\\Routing\\RouteActionHandler->__invoke(Array)\n#13 \/var\/www\/nextcloud\/lib\/private\/Route\/Router.php(299): call_user_func(Object(OC\\AppFramework\\Routing\\RouteActionHandler), Array)\n#14 \/var\/www\/nextcloud\/lib\/base.php(1010): OC\\Route\\Router->match('\/apps\/ocr')\n#15 \/var\/www\/nextcloud\/index.php(40): OC::handleRequest()\n#16 {main}","File":"\/var\/www\/nextcloud\/lib\/public\/AppFramework\/Db\/Mapper.php","Line":289}","level":3,"time":"2017-02-14T18:04:15+00:00","method":"POST","url":"/nextcloud/index.php/apps/ocr","user":"[email protected]","version":"11.0.1.2"}
{"reqId":"2cHVIjfj1kBYuU64e6xu","remoteAddr":"192.168.1.104","app":"index","message":"Exception: {"Exception":"OCP\\AppFramework\\Db\\DoesNotExistException","Message":"Did expect one result but found none when executing: query \"SELECT file_target FROM PREFIXshare WHERE file_source = ? AND share_with = ? AND uid_owner = ?\"; parameters Array\n(\n [0] => 320\n [1] => [email protected]\n [2] => local::\/data\/data\/\n)\n; limit \"\"; offset \"\"","Code":0,"Trace":"#0 \/var\/www\/nextcloud\/lib\/public\/AppFramework\/Db\/Mapper.php(373): OCP\\AppFramework\\Db\\Mapper->findOneQuery('SELECT file_tar...', Array, NULL, NULL)\n#1 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Db\/ShareMapper.php(42): OCP\\AppFramework\\Db\\Mapper->findEntity('SELECT file_tar...', Array)\n#2 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Service\/OcrService.php(320): OCA\\Ocr\\Db\\ShareMapper->find(320, '[email protected]...', 'local::\/data\/da...')\n#3 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Service\/OcrService.php(173): OCA\\Ocr\\Service\\OcrService->buildTargetForShared(Object(OCA\\Ocr\\Db\\File))\n#4 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/OcrController.php(74): OCA\\Ocr\\Service\\OcrService->process(Array, Array)\n#5 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/Errors.php(35): OCA\\Ocr\\Controller\\OcrController->OCA\\Ocr\\Controller\\{closure}()\n#6 \/var\/www\/nextcloud\/apps\/ocr\/lib\/Controller\/OcrController.php(75): OCA\\Ocr\\Controller\\OcrController->handleNotFound(Object(Closure))\n#7 [internal function]: OCA\\Ocr\\Controller\\OcrController->process(Array, Array)\n#8 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Http\/Dispatcher.php(160): call_user_func_array(Array, Array)\n#9 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Http\/Dispatcher.php(90): OC\\AppFramework\\Http\\Dispatcher->executeController(Object(OCA\\Ocr\\Controller\\OcrController), 'process')\n#10 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/App.php(114): 
OC\\AppFramework\\Http\\Dispatcher->dispatch(Object(OCA\\Ocr\\Controller\\OcrController), 'process')\n#11 \/var\/www\/nextcloud\/lib\/private\/AppFramework\/Routing\/RouteActionHandler.php(47): OC\\AppFramework\\App::main('OcrController', 'process', Object(OC\\AppFramework\\DependencyInjection\\DIContainer), Array)\n#12 [internal function]: OC\\AppFramework\\Routing\\RouteActionHandler->__invoke(Array)\n#13 \/var\/www\/nextcloud\/lib\/private\/Route\/Router.php(299): call_user_func(Object(OC\\AppFramework\\Routing\\RouteActionHandler), Array)\n#14 \/var\/www\/nextcloud\/lib\/base.php(1010): OC\\Route\\Router->match('\/apps\/ocr')\n#15 \/var\/www\/nextcloud\/index.php(40): OC::handleRequest()\n#16 {main}","File":"\/var\/www\/nextcloud\/lib\/public\/AppFramework\/Db\/Mapper.php","Line":289}","level":3,"time":"2017-02-14T18:04:15+00:00","method":"POST","url":"/nextcloud/index.php/apps/ocr","user":"[email protected]","version":"11.0.1.2"}

[Feature Request] Folder processing

Mark a folder and process it. Maybe even specify a folder in the user settings, which then can be processed automatically by the cron jobs in nextcloud.

Allow external processing

It would be useful if the app could use instances of the required dependencies hosted at an external site, if this is possible:
a) OCRmyPDF
b) tesseract-ocr
This would benefit those using hosted Nextcloud versions where they cannot control the software on the server.

Missing support for JPEG2000 images

Currently JPEG2000 images are not supported.

Those files are typically named *.jp2 and use the mime type image/jp2 which is supported by Tesseract 3.x and could be added to the OCR app.

Some more image formats are also supported by Tesseract 3.x and missing in the OCR app.

PHP error during install: Array to string conversion

Bug report

Expected Behavior

I'm trying to install the OCR app on a Nextcloud 11.0.0 installation with PHP 7.0.
Tesseract and Ocrmypdf commands are installed and executable for any user:

% which ocrmypdf
/usr/local/bin/ocrmypdf
% ls -l /usr/local/bin/ocrmypdf
-rwxr-xr-x  1 root  wheel  242 Dec 20 19:53 /usr/local/bin/ocrmypdf

% which tesseract
/usr/local/bin/tesseract
% ls -l /usr/local/bin/tesseract
-rwxr-xr-x  1 root  wheel  21280 Dec 15 01:42 /usr/local/bin/tesseract

Current Behavior

When I click "Activate" in the Nextcloud App Admin, I get the following error message:

App "Array" cannot be installed because the following dependencies are not fulfilled: The command line tool ocrmypdf could not be found The command line tool tesseract could not be found

In the Admin/Logging section I can then see two error messages:

Error PHP Array to string conversion at /next/data/nextcloud/lib/private/legacy/l10n/string.php#72

Error core App "Array" cannot be installed because the following dependencies are not fulfilled: The command line tool ocrmypdf could not be found The command line tool tesseract could not be found

Environment

  • Nextcloud 11.0.0
  • Postgres 9.3
  • PHP 7.0
  • FreeBSD 11-stable
  • Ocrmypdf 4.3.4
  • Local Encryption turned off
tesseract 3.04.01
 leptonica-1.72
  libgif 5.1.3 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.6.23+apng : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.5.0 : libopenjp2 2.1.0

Log File Content (nextcloud/owncloud.log of the "data"-directory)

{"reqId":"wRghPv69b1oS1OA4rmt7","remoteAddr":"x.x.x.x","app":"PHP","message":"Array to string conversion at \/next\/data\/nextcloud\/lib\/private\/legacy\/l10n\/string.php#72","level":3,"time":"2016-12-21T09:35:48+00:00","method":"POST","url":"\/nextcloud\/index.php\/settings\/ajax\/enableapp.php","user":"x","version":"11.0.0.10"}

{"reqId":"wRghPv69b1oS1OA4rmt7","remoteAddr":"x.x.x.x","app":"core","message":"App \"Array\" cannot be installed because the following dependencies are not fulfilled: The command line tool ocrmypdf could not be found\nThe command line tool tesseract could not be found","level":3,"time":"2016-12-21T09:35:48+00:00","method":"POST","url":"\/nextcloud\/index.php\/settings\/ajax\/enableapp.php","user":"x","version":"11.0.0.10"}

Adding tests for better test coverage

Feature request

Expected Behavior

More tests that cover the whole OCR service, not only a small part of it.

Current Behavior

Most parts of the OCR service are left out because they call global PHP functions, which cannot be mocked.

Possible Solution

Maybe we can set up a better environment in which the global functions work correctly on Travis. (For now, this only works in the local dev environment.)

Context

The project becomes more reliable with this.

OCR doesn't start for my files in a shared directory

Bug report

NC Version 11.01, current OCR app.

When I want to OCR a file that I uploaded myself, but that resides in a directory that has been shared with me, OCR does not start and an error appears in the log:

OCP\AppFramework\Db\DoesNotExistException: Did expect one result but found none when executing: query "SELECT file_target FROM *PREFIX*share WHERE file_source = ? AND share_with = ? AND uid_owner = ?"; parameters Array ( [0] => 168228 [1] => current.user [2] => OwnerOfSharedDir ) ; limit ""; offset ""

When I log in as (OwnerOfSharedDir), OCR works fine.

Is this reproducible or shall I do some more tests and provide more info here?

Breaks Nextcloud 11

Bug report

Expected Behavior

I should be able to keep using Nextcloud even if the app is broken.

Current Behavior

It's impossible to use the Files app

Possible Solution

Steps to Reproduce (for bugs)

  1. Enable app

Context

Add exception handling to not break Nextcloud when a serious issue occurs.

Your Environment

  • OCR version used:
tesseract 3.04.01
 leptonica-1.72
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.1) : libpng 1.6.28+apng : libtiff 4.0.7 : zlib 1.2.8 : libwebp 0.5.2 : libopenjp2 2.1.2
  • ownCloud/nextcloud version: 11.0.1
  • PHP version: 7.1
  • Database version
  • Are you using encryption: no

Log File Content (nextcloud/owncloud.log of the "data"-directory)

An unhandled exception has been thrown:
Error: Call to undefined function OCA\Ocr\Service\msg_get_queue() in customapps/ocr/lib/Service/QueueService.php:70
Stack trace:
#0 [internal function]: OCA\Ocr\Service\QueueService->__construct(Object(OCA\Ocr\Db\OcrStatusMapper), Object(OC\AllConfig), Object(OC\L10N\L10N), Object(OC\Log))
#1 lib/private/AppFramework/Utility/SimpleContainer.php(79): ReflectionClass->newInstanceArgs(Array)
#2 lib/private/AppFramework/Utility/SimpleContainer.php(96): OC\AppFramework\Utility\SimpleContainer->buildClass(Object(ReflectionClass))
#3 lib/private/AppFramework/Utility/SimpleContainer.php(117): OC\AppFramework\Utility\SimpleContainer->resolve('OCA\\Ocr\\Service...')
#4 lib/private/AppFramework/DependencyInjection/DIContainer.php(544): OC\AppFramework\Utility\SimpleContainer->query('OCA\\Ocr\\Service...')
#5 lib/private/AppFramework/Utility/SimpleContainer.php(66): OC\AppFramework\DependencyInjection\DIContainer->query('OCA\\Ocr\\Service...')
#6 lib/private/AppFramework/Utility/SimpleContainer.php(96): OC\AppFramework\Utility\SimpleContainer->buildClass(Object(ReflectionClass))
#7 lib/private/AppFramework/Utility/SimpleContainer.php(117): OC\AppFramework\Utility\SimpleContainer->resolve('OCA\\Ocr\\Service...')
#8 lib/private/AppFramework/DependencyInjection/DIContainer.php(544): OC\AppFramework\Utility\SimpleContainer->query('OCA\\Ocr\\Service...')
#9 lib/private/AppFramework/Utility/SimpleContainer.php(66): OC\AppFramework\DependencyInjection\DIContainer->query('OCA\\Ocr\\Service...')
#10 lib/private/AppFramework/Utility/SimpleContainer.php(96): OC\AppFramework\Utility\SimpleContainer->buildClass(Object(ReflectionClass))
#11 lib/private/AppFramework/Utility/SimpleContainer.php(117): OC\AppFramework\Utility\SimpleContainer->resolve('OCA\\Ocr\\Command...')
#12 lib/private/AppFramework/DependencyInjection/DIContainer.php(544): OC\AppFramework\Utility\SimpleContainer->query('OCA\\Ocr\\Command...')
#13 customapps/ocr/appinfo/register_command.php(18): OC\AppFramework\DependencyInjection\DIContainer->query('OCA\\Ocr\\Command...')
#14 lib/private/Console/Application.php(119): require('/backyard/yourmum/d...')
#15 console.php(89): OC\Console\Application->loadCommands(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#16 occ(11): require_once('/backyard/yourmum/d...')
#17 {main}

Grammar error

throw new NotFoundException($this->l10n->t('Empty passed parameters.'));

It should be "Empty parameters passed" instead of "Empty passed parameters".
You don't say "Leere übergeben Parameter" in German either ;)

OCR: 1 currently pending file in queue

Bug report

Expected Behavior

Don't know

Current Behavior

  • Started OCR scan for a PDF file
  • Get "OCR: 1 currently pending file in queue" displayed on top
  • Started next OCR scan
  • Get "OCR: 2 currently pending file in queue" displayed on top
  • and so on .....

Possible Solution

No Idea

Steps to Reproduce (for bugs)

Just install it as below.

Context

Your Environment

  • OCR version used: 2.0.0
  • Browser Name and version: Firefox 50.1.0
  • Operating System and version (desktop or mobile): Ubuntu 16.04
  • Nextcloud version: 11.0.0
  • PHP version: PHP 7.0.8-0ubuntu0.16.04.3
  • Database version: mysql Ver 14.14 Distrib 5.7.16
  • Are you using encryption: no

Access for user www-data tested: OK

  • sudo -u www-data tesseract
  • sudo -u www-data ocrmypdf

Log File Content (nextcloud/owncloud.log of the "data"-directory)

  • 4 older OCR scans in the queue
  • Uploaded file "PHA--2054Z0.pdf" to /
  • Started OCR scan - German, User "xxxAdmin" in group admins

Level App Message Time
Debug ocr Following status objects failed: [] 2016-12-28T14:11:52+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:52+0100
Debug ocr Following status objects failed: [] 2016-12-28T14:11:50+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:50+0100
Debug ocr Following status objects failed: [] 2016-12-28T14:11:47+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:47+0100
Debug ocr Following status objects failed: [] 2016-12-28T14:11:45+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:45+0100
Debug ocr Client message: "{"type":"mypdf","datadirectory":"\/var\/ocdata","path":"\/xxxAdmin\/files\/PHA--2054Z0.pdf","tempfile":"\/var\/ocdata\/upload-tmp\/oc_tmp_B6AFT0","language":"deu","statusid":5,"occdir":"\/var\/www\/nextcloud"}" 2016-12-28T14:11:43+0100
Debug ocr Fetched languages: ["ita","fra","osd","deu","spa","equ","por","eng"] 2016-12-28T14:11:43+0100
Debug ocr Fetching languages. 2016-12-28T14:11:42+0100
Debug ocr Will now process files: [{"name":"PHA--2054Z0.pdf","path":"/","type":"file","mimetype":"application/pdf"}] with language: "deu" 2016-12-28T14:11:42+0100
Debug ocr Following status objects failed: [] 2016-12-28T14:11:41+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:41+0100
Debug ocr Following status objects failed: [] 2016-12-28T14:11:36+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:36+0100
Debug ocr Following status objects failed: [] 2016-12-28T14:11:31+0100
Debug ocr Find processed ocr files and put them to the right dirs. 2016-12-28T14:11:31+0100
Info admin_audit File written to: "//PHA--2054Z0.pdf" 2016-12-28T14:11:30+0100
Info admin_audit File created: "//PHA--2054Z0.pdf" 2016-12-28T14:11:30+0100

Installation log

Installation steps

  1. sudo apt-get install python3-pip
  2. sudo pip3 install --upgrade pip
  3. sudo apt-get install libffi-dev
  4. sudo pip3 install ocrmypdf
  5. sudo apt-get install tesseract-ocr tesseract-ocr-deu
  6. sudo apt-get install tesseract-ocr-spa
  7. sudo apt-get install tesseract-ocr-por
  8. sudo apt-get install tesseract-ocr-ndl
  9. sudo apt-get install tesseract-ocr-ita
  10. sudo apt-get install tesseract-ocr-fra
  11. sudo apt-get install tesseract-ocr-eng
  12. sudo apt-get install tesseract-ocr-deu-frak

Installation output

ocadmin@owncloud:~$ sudo apt-get install python3-pip
[sudo] password for ocadmin: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  build-essential g++ g++-5 libpython3-dev libpython3.5-dev libstdc++-5-dev python3-dev python3-setuptools python3-wheel
  python3.5-dev
Suggested packages:
  g++-multilib g++-5-multilib gcc-5-doc libstdc++6-5-dbg libstdc++-5-doc python-setuptools-doc
The following NEW packages will be installed:
  build-essential g++ g++-5 libpython3-dev libpython3.5-dev libstdc++-5-dev python3-dev python3-pip python3-setuptools
  python3-wheel python3.5-dev
0 upgraded, 11 newly installed, 0 to remove and 0 not upgraded.
Need to get 47.7 MB of archives.
After this operation, 94.3 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://se.archive.ubuntu.com/ubuntu xenial-updates/main amd64 libstdc++-5-dev amd64 5.4.0-6ubuntu1~16.04.4 [1,426 kB]
Get:3 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 g++ amd64 4:5.3.1-1ubuntu1 [1,504 B]
Get:4 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 build-essential amd64 12.1ubuntu2 [4,758 B]
Get:6 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libpython3-dev amd64 3.5.1-3 [6,926 B]
Get:7 http://se.archive.ubuntu.com/ubuntu xenial-updates/main amd64 python3.5-dev amd64 3.5.2-2ubuntu0~16.04.1 [413 kB]
Get:8 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 python3-dev amd64 3.5.1-3 [1,186 B]
Get:9 http://se.archive.ubuntu.com/ubuntu xenial-updates/universe amd64 python3-pip all 8.1.1-2ubuntu0.4 [109 kB]
Get:10 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 python3-setuptools all 20.7.0-1 [88.0 kB]
Get:11 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 python3-wheel all 0.29.0-1 [48.1 kB]         
Get:2 http://gensho.acc.umu.se/ubuntu xenial-updates/main amd64 g++-5 amd64 5.4.0-6ubuntu1~16.04.4 [8,300 kB]             
Get:5 http://caesar.acc.umu.se/ubuntu xenial-updates/main amd64 libpython3.5-dev amd64 3.5.2-2ubuntu0~16.04.1 [37.3 MB]
Fetched 47.7 MB in 13s (3,619 kB/s)                                                                                       
Selecting previously unselected package libstdc++-5-dev:amd64.
(Reading database ... 120514 files and directories currently installed.)
Preparing to unpack .../libstdc++-5-dev_5.4.0-6ubuntu1~16.04.4_amd64.deb ...
Unpacking libstdc++-5-dev:amd64 (5.4.0-6ubuntu1~16.04.4) ...
Selecting previously unselected package g++-5.
Preparing to unpack .../g++-5_5.4.0-6ubuntu1~16.04.4_amd64.deb ...
Unpacking g++-5 (5.4.0-6ubuntu1~16.04.4) ...
Selecting previously unselected package g++.
Preparing to unpack .../g++_4%3a5.3.1-1ubuntu1_amd64.deb ...
Unpacking g++ (4:5.3.1-1ubuntu1) ...
Selecting previously unselected package build-essential.
Preparing to unpack .../build-essential_12.1ubuntu2_amd64.deb ...
Unpacking build-essential (12.1ubuntu2) ...
Selecting previously unselected package libpython3.5-dev:amd64.
Preparing to unpack .../libpython3.5-dev_3.5.2-2ubuntu0~16.04.1_amd64.deb ...
Unpacking libpython3.5-dev:amd64 (3.5.2-2ubuntu0~16.04.1) ...
Selecting previously unselected package libpython3-dev:amd64.
Preparing to unpack .../libpython3-dev_3.5.1-3_amd64.deb ...
Unpacking libpython3-dev:amd64 (3.5.1-3) ...
Selecting previously unselected package python3.5-dev.
Preparing to unpack .../python3.5-dev_3.5.2-2ubuntu0~16.04.1_amd64.deb ...
Unpacking python3.5-dev (3.5.2-2ubuntu0~16.04.1) ...
Selecting previously unselected package python3-dev.
Preparing to unpack .../python3-dev_3.5.1-3_amd64.deb ...
Unpacking python3-dev (3.5.1-3) ...
Selecting previously unselected package python3-pip.
Preparing to unpack .../python3-pip_8.1.1-2ubuntu0.4_all.deb ...
Unpacking python3-pip (8.1.1-2ubuntu0.4) ...
Selecting previously unselected package python3-setuptools.
Preparing to unpack .../python3-setuptools_20.7.0-1_all.deb ...
Unpacking python3-setuptools (20.7.0-1) ...
Selecting previously unselected package python3-wheel.
Preparing to unpack .../python3-wheel_0.29.0-1_all.deb ...
Unpacking python3-wheel (0.29.0-1) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up libstdc++-5-dev:amd64 (5.4.0-6ubuntu1~16.04.4) ...
Setting up g++-5 (5.4.0-6ubuntu1~16.04.4) ...
Setting up g++ (4:5.3.1-1ubuntu1) ...
update-alternatives: using /usr/bin/g++ to provide /usr/bin/c++ (c++) in auto mode
Setting up build-essential (12.1ubuntu2) ...
Setting up libpython3.5-dev:amd64 (3.5.2-2ubuntu0~16.04.1) ...
Setting up libpython3-dev:amd64 (3.5.1-3) ...
Setting up python3.5-dev (3.5.2-2ubuntu0~16.04.1) ...
Setting up python3-dev (3.5.1-3) ...
Setting up python3-pip (8.1.1-2ubuntu0.4) ...
Setting up python3-setuptools (20.7.0-1) ...
Setting up python3-wheel (0.29.0-1) ...
ocadmin@owncloud:~$ sudo pip3 install --upgrade pip
The directory '/home/ocadmin/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ocadmin/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pip
  Downloading pip-9.0.1-py2.py3-none-any.whl (1.3MB)
    100% |████████████████████████████████| 1.3MB 712kB/s 
Installing collected packages: pip
  Found existing installation: pip 8.1.1
    Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
Successfully installed pip-9.0.1
ocadmin@owncloud:~$ sudo apt-get install libffi-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libffi-dev is already the newest version (3.2.1-4).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
ocadmin@owncloud:~$ sudo pip3 install ocrmypdf
The directory '/home/ocadmin/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ocadmin/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting ocrmypdf
  Downloading ocrmypdf-4.3.4-py34-none-any.whl (48kB)
    100% |████████████████████████████████| 51kB 561kB/s 
Collecting cffi>=1.5.0 (from ocrmypdf)
  Downloading cffi-1.9.1-cp35-cp35m-manylinux1_x86_64.whl (398kB)
    100% |████████████████████████████████| 399kB 1.8MB/s 
Collecting PyPDF2>=1.26 (from ocrmypdf)
  Downloading PyPDF2-1.26.0.tar.gz (77kB)
    100% |████████████████████████████████| 81kB 4.6MB/s 
Collecting ruffus==2.6.3 (from ocrmypdf)
  Downloading ruffus-2.6.3.tar.gz (36.9MB)
    100% |████████████████████████████████| 36.9MB 26kB/s 
Collecting img2pdf>=0.2.1 (from ocrmypdf)
  Downloading img2pdf-0.2.1.tar.gz (46kB)
    100% |████████████████████████████████| 51kB 3.8MB/s 
Collecting Pillow>=3.1.0 (from ocrmypdf)
  Downloading Pillow-3.4.2-cp35-cp35m-manylinux1_x86_64.whl (5.6MB)
    100% |████████████████████████████████| 5.6MB 181kB/s 
Collecting reportlab>=3.2.0 (from ocrmypdf)
  Downloading reportlab-3.3.0.tar.gz (2.0MB)
    100% |████████████████████████████████| 2.0MB 503kB/s 
Collecting pycparser (from cffi>=1.5.0->ocrmypdf)
  Downloading pycparser-2.17.tar.gz (231kB)
    100% |████████████████████████████████| 235kB 362kB/s 
Requirement already satisfied: pip>=1.4.1 in /usr/local/lib/python3.5/dist-packages (from reportlab>=3.2.0->ocrmypdf)
Requirement already satisfied: setuptools>=2.2 in /usr/lib/python3/dist-packages (from reportlab>=3.2.0->ocrmypdf)
Installing collected packages: pycparser, cffi, PyPDF2, ruffus, Pillow, img2pdf, reportlab, ocrmypdf
  Running setup.py install for pycparser ... done
  Running setup.py install for PyPDF2 ... done
  Running setup.py install for ruffus ... done
  Running setup.py install for img2pdf ... done
  Running setup.py install for reportlab ... done
Successfully installed Pillow-3.4.2 PyPDF2-1.26.0 cffi-1.9.1 img2pdf-0.2.1 ocrmypdf-4.3.4 pycparser-2.17 reportlab-3.3.0 ruffus-2.6.3
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr tesseract-ocr-deu
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libdatrie1 liblept5 libopenjp2-7 libpango-1.0-0 libpangocairo-1.0-0 libpangoft2-1.0-0 libtesseract3 libthai-data
  libthai0 libwebp5 tesseract-ocr-eng tesseract-ocr-equ tesseract-ocr-osd
The following NEW packages will be installed:
  libdatrie1 liblept5 libopenjp2-7 libpango-1.0-0 libpangocairo-1.0-0 libpangoft2-1.0-0 libtesseract3 libthai-data
  libthai0 libwebp5 tesseract-ocr tesseract-ocr-deu tesseract-ocr-eng tesseract-ocr-equ tesseract-ocr-osd
0 upgraded, 15 newly installed, 0 to remove and 0 not upgraded.
Need to get 19.3 MB of archives.
After this operation, 73.3 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libdatrie1 amd64 0.2.10-2 [17.3 kB]
Get:2 http://se.archive.ubuntu.com/ubuntu xenial-updates/universe amd64 libopenjp2-7 amd64 2.1.0-2.1ubuntu0.1 [103 kB]
Get:3 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libwebp5 amd64 0.4.4-1 [165 kB]
Get:4 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 liblept5 amd64 1.73-1 [872 kB]
Get:5 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libthai-data all 0.1.24-2 [131 kB]
Get:6 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libthai0 amd64 0.1.24-2 [17.3 kB]
Get:7 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libpango-1.0-0 amd64 1.38.1-1 [148 kB]
Get:8 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libpangoft2-1.0-0 amd64 1.38.1-1 [33.3 kB]
Get:9 http://se.archive.ubuntu.com/ubuntu xenial/main amd64 libpangocairo-1.0-0 amd64 1.38.1-1 [20.5 kB]
Get:10 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 libtesseract3 amd64 3.04.01-4 [1,106 kB]
Get:12 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 tesseract-ocr-osd all 3.04.00-1 [2,988 kB]
Get:11 http://gensho.acc.umu.se/ubuntu xenial/universe amd64 tesseract-ocr-eng all 3.04.00-1 [8,824 kB]
Get:13 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 tesseract-ocr-equ all 3.04.00-1 [568 kB]
Get:14 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 tesseract-ocr amd64 3.04.01-4 [132 kB]
Get:15 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 tesseract-ocr-deu all 3.04.00-1 [4,153 kB]
Fetched 19.3 MB in 6s (3,137 kB/s)                                                                                        
Selecting previously unselected package libdatrie1:amd64.
(Reading database ... 121649 files and directories currently installed.)
Preparing to unpack .../libdatrie1_0.2.10-2_amd64.deb ...
Unpacking libdatrie1:amd64 (0.2.10-2) ...
Selecting previously unselected package libopenjp2-7:amd64.
Preparing to unpack .../libopenjp2-7_2.1.0-2.1ubuntu0.1_amd64.deb ...
Unpacking libopenjp2-7:amd64 (2.1.0-2.1ubuntu0.1) ...
Selecting previously unselected package libwebp5:amd64.
Preparing to unpack .../libwebp5_0.4.4-1_amd64.deb ...
Unpacking libwebp5:amd64 (0.4.4-1) ...
Selecting previously unselected package liblept5.
Preparing to unpack .../liblept5_1.73-1_amd64.deb ...
Unpacking liblept5 (1.73-1) ...
Selecting previously unselected package libthai-data.
Preparing to unpack .../libthai-data_0.1.24-2_all.deb ...
Unpacking libthai-data (0.1.24-2) ...
Selecting previously unselected package libthai0:amd64.
Preparing to unpack .../libthai0_0.1.24-2_amd64.deb ...
Unpacking libthai0:amd64 (0.1.24-2) ...
Selecting previously unselected package libpango-1.0-0:amd64.
Preparing to unpack .../libpango-1.0-0_1.38.1-1_amd64.deb ...
Unpacking libpango-1.0-0:amd64 (1.38.1-1) ...
Selecting previously unselected package libpangoft2-1.0-0:amd64.
Preparing to unpack .../libpangoft2-1.0-0_1.38.1-1_amd64.deb ...
Unpacking libpangoft2-1.0-0:amd64 (1.38.1-1) ...
Selecting previously unselected package libpangocairo-1.0-0:amd64.
Preparing to unpack .../libpangocairo-1.0-0_1.38.1-1_amd64.deb ...
Unpacking libpangocairo-1.0-0:amd64 (1.38.1-1) ...
Selecting previously unselected package libtesseract3.
Preparing to unpack .../libtesseract3_3.04.01-4_amd64.deb ...
Unpacking libtesseract3 (3.04.01-4) ...
Selecting previously unselected package tesseract-ocr-eng.
Preparing to unpack .../tesseract-ocr-eng_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-eng (3.04.00-1) ...
Selecting previously unselected package tesseract-ocr-osd.
Preparing to unpack .../tesseract-ocr-osd_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-osd (3.04.00-1) ...
Selecting previously unselected package tesseract-ocr-equ.
Preparing to unpack .../tesseract-ocr-equ_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-equ (3.04.00-1) ...
Selecting previously unselected package tesseract-ocr.
Preparing to unpack .../tesseract-ocr_3.04.01-4_amd64.deb ...
Unpacking tesseract-ocr (3.04.01-4) ...
Selecting previously unselected package tesseract-ocr-deu.
Preparing to unpack .../tesseract-ocr-deu_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-deu (3.04.00-1) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up libdatrie1:amd64 (0.2.10-2) ...
Setting up libopenjp2-7:amd64 (2.1.0-2.1ubuntu0.1) ...
Setting up libwebp5:amd64 (0.4.4-1) ...
Setting up liblept5 (1.73-1) ...
Setting up libthai-data (0.1.24-2) ...
Setting up libthai0:amd64 (0.1.24-2) ...
Setting up libpango-1.0-0:amd64 (1.38.1-1) ...
Setting up libpangoft2-1.0-0:amd64 (1.38.1-1) ...
Setting up libpangocairo-1.0-0:amd64 (1.38.1-1) ...
Setting up libtesseract3 (3.04.01-4) ...
Setting up tesseract-ocr-eng (3.04.00-1) ...
Setting up tesseract-ocr-osd (3.04.00-1) ...
Setting up tesseract-ocr-equ (3.04.00-1) ...
Setting up tesseract-ocr (3.04.01-4) ...
Setting up tesseract-ocr-deu (3.04.00-1) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-spa
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tesseract-ocr-spa
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 6,688 kB of archives.
After this operation, 39.2 MB of additional disk space will be used.
Get:1 http://caesar.acc.umu.se/ubuntu xenial/universe amd64 tesseract-ocr-spa all 3.04.00-1 [6,688 kB]
Fetched 6,688 kB in 2s (3,143 kB/s)            
Selecting previously unselected package tesseract-ocr-spa.
(Reading database ... 121790 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-spa_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-spa (3.04.00-1) ...
Setting up tesseract-ocr-spa (3.04.00-1) ...
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-por
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tesseract-ocr-por
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 3,893 kB of archives.
After this operation, 12.9 MB of additional disk space will be used.
Get:1 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 tesseract-ocr-por all 3.04.00-1 [3,893 kB]
Fetched 3,893 kB in 1s (3,273 kB/s)            
Selecting previously unselected package tesseract-ocr-por.
(Reading database ... 121801 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-por_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-por (3.04.00-1) ...
Setting up tesseract-ocr-por (3.04.00-1) ...
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-ndl
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package tesseract-ocr-ndl
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-ita
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tesseract-ocr-ita
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 5,845 kB of archives.
After this operation, 32.8 MB of additional disk space will be used.
Get:1 http://saimei.acc.umu.se/ubuntu xenial/universe amd64 tesseract-ocr-ita all 3.04.00-1 [5,845 kB]
Fetched 5,845 kB in 16s (357 kB/s)             
Selecting previously unselected package tesseract-ocr-ita.
(Reading database ... 121805 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-ita_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-ita (3.04.00-1) ...
Setting up tesseract-ocr-ita (3.04.00-1) ...
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-fra
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tesseract-ocr-fra
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 6,075 kB of archives.
After this operation, 37.4 MB of additional disk space will be used.
Get:1 http://caesar.acc.umu.se/ubuntu xenial/universe amd64 tesseract-ocr-fra all 3.04.00-1 [6,075 kB]
Fetched 6,075 kB in 1s (3,163 kB/s)            
Selecting previously unselected package tesseract-ocr-fra.
(Reading database ... 121817 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-fra_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-fra (3.04.00-1) ...
Setting up tesseract-ocr-fra (3.04.00-1) ...
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-eng
Reading package lists... Done
Building dependency tree       
Reading state information... Done
tesseract-ocr-eng is already the newest version (3.04.00-1).
tesseract-ocr-eng set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
ocadmin@owncloud:~$ sudo apt-get install tesseract-ocr-deu-frak
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tesseract-ocr-deu-frak
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 616 kB of archives.
After this operation, 2,013 kB of additional disk space will be used.
Get:1 http://se.archive.ubuntu.com/ubuntu xenial/universe amd64 tesseract-ocr-deu-frak all 3.04.00-1 [616 kB]
Fetched 616 kB in 0s (1,016 kB/s)              
Selecting previously unselected package tesseract-ocr-deu-frak.
(Reading database ... 121829 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-deu-frak_3.04.00-1_all.deb ...
Unpacking tesseract-ocr-deu-frak (3.04.00-1) ...
Setting up tesseract-ocr-deu-frak (3.04.00-1) ...

Travis.yml update for mysql

Update the travis.yml for MySQL. First, the MySQL service has to be available in the Trusty beta of Travis.

[Feature request]: Option to replace file during processing

Hey guys,
First of all, the Nextcloud OCR plugin is great and works fine. Thank you for your work.

Have you thought about an option to replace the original file during the OCR process? Every time I run OCR, I delete the old file and rename the new one.

Possible solution:
I would appreciate an option in the plugin settings or somewhere else to automatically replace a file when it is processed by OCR. In case of an error or a bad result, for example, I would be able to restore the original file via the NC history feature.

View bug

Bug report

Expected Behavior

After a file is selected and the delete action is clicked (in the top action bar), the file action bar gets hidden and the file information sorting is available again.

Current Behavior

The OCR icon and menu option are still available and do not get disabled.

Possible Solution

Maybe hide it after the events of the other actions have fired?

Queue processing for multiple files

I have a queueing solution in mind (3rd party) which would allow queueing the tesseract processing for multiple files.
It could run in the background and use WebDAV to upload the results to ownCloud.
Feedback on the status should be available (WebSockets, maybe).
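To make the idea more concrete, here is a minimal sketch of a background queue with per-file status feedback. It is not the 3rd-party solution mentioned above (which is not named here) and uses only Python's standard library as a stand-in. run_ocr is a hypothetical placeholder that does not call tesseract, and the WebDAV upload is left out entirely.

```python
import queue
import threading


def run_ocr(path):
    """Placeholder for the real OCR call (hypothetical); returns the output path."""
    return path + ".ocr.pdf"


def worker(jobs, status, lock):
    """Take files from the queue one by one and record their status."""
    while True:
        path = jobs.get()
        if path is None:  # sentinel: no more work
            jobs.task_done()
            break
        with lock:
            status[path] = "processing"
        try:
            result = run_ocr(path)
            with lock:
                status[path] = "done: " + result
        except Exception as exc:  # keep the worker alive if one file fails
            with lock:
                status[path] = "failed: %s" % exc
        finally:
            jobs.task_done()


def process_files(paths):
    """Queue all files, process them in a background thread, return the status map."""
    jobs = queue.Queue()
    status = {p: "pending" for p in paths}
    lock = threading.Lock()
    thread = threading.Thread(target=worker, args=(jobs, status, lock), daemon=True)
    thread.start()
    for p in paths:
        jobs.put(p)
    jobs.put(None)
    jobs.join()
    thread.join()
    return status


print(process_files(["a.pdf", "b.pdf"]))
```

The status dictionary is what a status endpoint (or a WebSocket push) could expose to the UI while the queue is being worked through.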

Display problems in "Apps" view in Nextcloud 10

Bug report

Current Behavior

After extracting the current master version of OCR to nextcloud/apps/ocr/ on my Nextcloud 10.0.1 installation, it appears in the Apps section under "Not activated" like this:

[object Object],[object Object] 1.0.0
von Janis Koehr (agpl-lizensiert)

I can enable and use OCR just fine, but as soon as it is enabled, the "Activated" page of the Apps view does not load anymore (endless loading animation).

After disabling the OCR app on the command line (with occ app:disable ocr), the "Activated" page works again.

I took a quick look at the OCR app code, but couldn't find the reason for this behaviour myself.

Your Environment

  • OCR version used: current master checkout
  • Browser Name and version: Safari 10 on macOS
  • Operating System and version (desktop or mobile): macOS Sierra
  • ownCloud/nextcloud version: (see ownCloud admin page or version.php) 10.0.1
  • PHP version: 7.0.8-0ubuntu0.16.04.3
  • Database version: 10.0.27-MariaDB-0ubuntu0.16.04.1
  • Are you using encryption: no

Log File Content (nextcloud/owncloud.log of the "data"-directory)

Nothing that looks like it has to do with this issue.

Fix Scrutinizer code coverage awaits

Scrutinizer waits a very long time for the code coverage and runs into a timeout. Maybe change the Travis behaviour once again in order to get this right.

Also: change the timeout time to 10 minutes.

supervisor example needs process name

I probably should have added this to the other issue (oops).

I forget the exact error, but since you are specifying multiple processes you have to specify process_name. This is what worked for me: process_name = %(program_name)s_%(process_num)02d

And this is the output of supervisorctl status:
myworker:myworker_00 RUNNING pid 2223, uptime 0:12:02
myworker:myworker_01 RUNNING pid 2240, uptime 0:12:02
myworker:myworker_02 RUNNING pid 2241, uptime 0:12:02
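
For anyone wondering where the _00, _01, _02 suffixes come from: as far as I understand, supervisor treats process_name as a Python string expression, so the expansion can be reproduced with plain %-formatting. The sketch below assumes process numbers start at 0 (the default); `expand_process_names` is a hypothetical helper for illustration, not a supervisor function.

```python
# Same template as in the supervisor config above.
PROCESS_NAME = "%(program_name)s_%(process_num)02d"


def expand_process_names(program_name, numprocs, template=PROCESS_NAME):
    """Return the process names for numprocs processes, numbered from 0."""
    return [
        template % {"program_name": program_name, "process_num": num}
        for num in range(numprocs)
    ]


names = expand_process_names("myworker", 3)
print(names)  # ['myworker_00', 'myworker_01', 'myworker_02']

# Without %(process_num) in the template, all processes would get the same name.
duplicates = expand_process_names("myworker", 3, template="%(program_name)s")
print(len(set(duplicates)))  # 1
```

This also shows why the process number has to be part of the name once there is more than one process: otherwise the names are not unique.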

OCR Settings blank

Hello,

On Nextcloud 12, the OCR entry in the personal settings is blank (screenshot: ocr).

The worker is running with: sudo -u pleskuser nohup php /var/www/vhosts/larsmueller.net/nextcloud/apps/ocr/worker/OCRWorker.php

(pleskuser is the equivalent of www-data on a Plesk server)

Bug report

Current Behavior

Blank OCR entry in the personal settings.

Your Environment

  • OCR version used:
  • Browser Name and version: Google Chrome
  • Operating System and version (desktop or mobile): Desktop
  • ownCloud/nextcloud version: Nextcloud 12.0.0
  • PHP version 7.15
  • Database version: mysql Ver 14.14 Distrib 5.7.18, for Linux (x86_64) using EditLine wrapper
  • Are you using encryption: no

Log File Content (nextcloud/owncloud.log of the "data"-directory)

no logfile entry

OCR App could not be initialized: "No languages found."

The warning at the top says:
OCR App could not be initialized: "No languages found."

But I do have languages installed:

# tesseract --list-langs
List of available languages (4):
deu
equ
osd
eng
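
As a way to check what a script would see, here is a small Python sketch that parses this output. It is not the app's detection code. One assumption worth testing: some Tesseract builds (3.x, as far as I know) print the language list to stderr instead of stdout, so a caller reading only stdout would get nothing. The function names are hypothetical.

```python
import subprocess


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first non-empty line is a header such as "List of available languages (4):".
    return [
        line for line in lines
        if not line.lower().startswith("list of available languages")
    ]


def installed_languages(binary="tesseract"):
    """Run tesseract and read stdout and stderr together, since the list
    may arrive on either stream depending on the build (assumption)."""
    proc = subprocess.run(
        [binary, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_list_langs(proc.stdout)


SAMPLE = """List of available languages (4):
deu
equ
osd
eng
"""
print(parse_list_langs(SAMPLE))  # ['deu', 'equ', 'osd', 'eng']
```

If `installed_languages()` returns the languages when run as the web server user but the app still reports none, the problem is more likely in how the app invokes tesseract (user, PATH, or output stream) than in the installation itself.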

Context

Your Environment

  • OCR version used: 2.3.0
  • ownCloud/nextcloud version: 11.0.1
  • PHP version: 7.1.1

Log File Content (nextcloud/owncloud.log of the "data"-directory)

"message": "Exception during ocr service function processing: {\"Exception\":\"OCA\\\\Ocr\\\\Service\\\\NotFoundException\",\"Message\":\"No languages found.\",\"Code\":0,\"Trace\":\"#0 \\\/customapps\\\/ocr\\\/lib\\\/Controller\\\/OcrController.php(61): OCA\\\\Ocr\\\\Service\\\\OcrService->listLanguages()\\n#1 \\\/customapps\\\/ocr\\\/lib\\\/Controller\\\/Errors.php(35): OCA\\\\Ocr\\\\Controller\\\\OcrController->OCA\\\\Ocr\\\\Controller\\\\{closure}()\\n#2 \\\/customapps\\\/ocr\\\/lib\\\/Controller\\\/OcrController.php(62): OCA\\\\Ocr\\\\Controller\\\\OcrController->handleNotFound(Object(Closure))\\n#3 [internal function]: OCA\\\\Ocr\\\\Controller\\\\OcrController->languages()\\n#4 \\\/lib\\\/private\\\/AppFramework\\\/Http\\\/Dispatcher.php(160): call_user_func_array(Array, Array)\\n#5 \\\/lib\\\/private\\\/AppFramework\\\/Http\\\/Dispatcher.php(90): OC\\\\AppFramework\\\\Http\\\\Dispatcher->executeController(Object(OCA\\\\Ocr\\\\Controller\\\\OcrController), 'languages')\\n#6 \\\/lib\\\/private\\\/AppFramework\\\/App.php(114): OC\\\\AppFramework\\\\Http\\\\Dispatcher->dispatch(Object(OCA\\\\Ocr\\\\Controller\\\\OcrController), 'languages')\\n#7 \\\/lib\\\/private\\\/AppFramework\\\/Routing\\\/RouteActionHandler.php(47): OC\\\\AppFramework\\\\App::main('OcrController', 'languages', Object(OC\\\\AppFramework\\\\DependencyInjection\\\\DIContainer), Array)\\n#8 [internal function]: OC\\\\AppFramework\\\\Routing\\\\RouteActionHandler->__invoke(Array)\\n#9 \\\/lib\\\/private\\\/Route\\\/Router.php(299): call_user_func(Object(OC\\\\AppFramework\\\\Routing\\\\RouteActionHandler), Array)\\n#10 \\\/lib\\\/base.php(1010): OC\\\\Route\\\\Router->match('\\\/apps\\\/ocr')\\n#11 \\\/index.php(40): OC::handleRequest()\\n#12 {main}\",\"File\":\"\\\/customapps\\\/ocr\\\/lib\\\/Service\\\/OcrService.php\",\"Line\":142}"

OCR text is not searchable by multiple-word phrase inside pdf viewer

Inside a PDF viewer (Acrobat Reader, or pdf.js in the browser), you cannot search for a phrase of multiple words. The phrase matches nothing even when it is in the document.

Bug report / Feature request

Expected Behavior

When the document contains, for example, "Breakfast menu", when you click the search icon (magnifying glass) and enter text "breakfast menu", it should match the text and find it.

Current Behavior

It only matches one word. For example, it matches "breakfast", or it matches "menu". If you try to search for two words, it fails to find a match, even when the two words are clearly together on the same line in the document!

Possible Solution

Possibly take a look at the parameters or settings for tesseract-ocr and see if it can be made to join words that are on the same line into one continuous line of text.

Steps to Reproduce (for bugs)

  1. Upload a scan of a page of text in pdf format.
  2. Run the ocr on it.
  3. Open the _OCR.pdf version of the pdf file which contains the recognized text.
  4. Click the magnifying glass, enter text for two adjacent words on the same line. Search fails to find the two words. It finds only one word at a time.

Context

Searching for only one word at a time is awkward and time-consuming.

Your Environment

  • OCR version used: Latest
  • Browser Name and version: Latest firefox.
  • Operating System and version (desktop or mobile): Windows 10, Linux Debian 8.
  • ownCloud/nextcloud version: (see ownCloud admin page or version.php) Latest NC.
  • PHP version 7.0
  • Database version 5.6 mysql mariadb
  • Are you using encryption: No.

Log File Content (nextcloud/owncloud.log of the "data"-directory)
