danpla / dpscreenocr

Program to recognize text on screen

Home Page: https://danpla.github.io/dpscreenocr/

License: zlib License

CMake 8.51% C++ 76.74% C 7.12% Python 4.87% Dockerfile 2.62% Shell 0.14%

ocr tesseract tesseract-ocr

dpscreenocr's People

Forkers

limkokholefork sveitser zhang11wu4 wroldwiedbwe weblate lipq525 jmichaelkane ainal-3fresources rayssags jessonfoo vincentlao clickwithclark andrewwippler vpnry onemotre

dpscreenocr's Issues

Create an "Add languages" button that shows instructions on how to add them

Currently, users see a read-only list of available languages, with no hint of how to add a new one. The manual describes the ways to do it, but not all users read it. It would be good to create an "Add languages" button that shows instructions on how to add new languages by installing Tesseract packages.

add-apt-repository: The repository [...] does not have a Release file

$ sudo add-apt-repository ppa:daniel.p/dpscreenocr

E: The repository 'https://ppa.launchpadcontent.net/daniel.p/dpscreenocr/ubuntu kinetic Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.

$ sudo apt install dpscreenocr
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package dpscreenocr

No languages detected due to TESSDATA_PREFIX

The application was running previously, but after

a Windows update and
uninstallation of a local installation of Tesseract (which was installed before dpScreenOCR),

the application opens with the message "Please install languages". I assumed that no language file was installed or that it was missing. Upon inspection, the file eng.traindata was present in the C:\Program Files (x86)\dpScreenOCR\tessdata directory. Downloading another language file and copying it to the same directory does not change the error message.

Removing and reinstalling dpScreenOCR several times did not yield different results. Uninstalling dpScreenOCR does not remove the configuration file in C:\Users\<user>\AppData\Local\dpscreenocr. I manually deleted the directory and reinstalled dpScreenOCR, but there was still no change.
Installation on another computer with a similar Windows level (version 20H2, build 19042.1466) was successful, so the issue is local to this computer.
Is dpScreenOCR aware of other Tesseract installations through environment variables?
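
The issue title points at the TESSDATA_PREFIX environment variable, and a stale value left behind by the removed Tesseract installation is one plausible cause. As a rough diagnostic sketch (not part of dpScreenOCR), the Python function below reports what the variable currently points to. The function name is hypothetical, and it assumes TESSDATA_PREFIX names the directory that directly contains the .traineddata files.

```python
import os
from pathlib import Path


def describe_tessdata_prefix(env=None):
    """Return a one-line report about TESSDATA_PREFIX in the given environment."""
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return "TESSDATA_PREFIX is not set"
    directory = Path(prefix)
    if not directory.is_dir():
        return f"TESSDATA_PREFIX points to a missing directory: {prefix}"
    # Assumption: language data files sit directly in this directory
    # and use the .traineddata extension.
    languages = sorted(p.stem for p in directory.glob("*.traineddata"))
    if not languages:
        return f"TESSDATA_PREFIX is set to {prefix}, but it has no .traineddata files"
    return f"TESSDATA_PREFIX is set to {prefix} with languages: {', '.join(languages)}"
```

Calling `print(describe_tessdata_prefix())` from the same user account shows whether a leftover value still points at the uninstalled Tesseract directory.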

Ability to freeze screen after toggle selection hotkey is pressed

Currently, after the hotkey is pressed, the screen doesn't freeze as it does in Text Extractor from Windows PowerToys, where I'm able to select text without worrying that I'll miss the moment after the hotkey press. Adding this feature would be convenient when I need to quickly copy some text that is about to disappear from the screen.

[linux] Program crashes after upgrading to newer Tesseract

When I try to translate some text on the screen, dpscreenocr crashes with:

Warning: Invalid resolution 0 dpi. Using 70 instead.
Segmentation fault

I don't know if the warning is meaningful, btw.
Any idea how to diagnose this better?
I'm running Ubuntu 18.04 and dpscreenocr 1.0.2 from the ppa.

Avast and AVG detect qwindowsvistastyle.dll and qwindows.dll as "Win32:MalOb-IJ [Cryp]"

Some of the DLLs are detected by AVG as heavily disguised malware.

Linux Wayland Support - Understanding Changes Needed

Hi, I'm just starting to look at the code and wondering what would be needed for Wayland support. If it's only about the hotkey triggering limitations, I know some projects (e.g. kupfer) have worked around needing X11KeyManager by offering a command-line method of invocation, paired with instructions to set up a custom shortcut (e.g. via GNOME settings).

I'm opening this issue to discuss and understand whether there are other aspects to consider in attempting to support Wayland.
(Thanks for your earlier advice; the application works fine under X11.)

Antivirus flags dpScreenOCR as malware

I just upgraded to 1.30 and Malwarebytes is flagging it as malware:

https://imgur.com/a/JLdKFMN

-Blocked Malware Details-
File: 1
MachineLearning/Anomalous.94%, C:\Program Files (x86)\dpScreenOCR\dpscreenocr.exe, Quarantined, 0, 392687, 1.0.63270, , shuriken, , 7A97F31B2DC40CFC17A36D015C311E30, 89CF93609749445DB72798C62731B3611525E7BAFA1872E3A9AC825720D03451

App doesn't see languages on Mint 20.2

I get this error:
qt5ct: using qt5ct plugin
Error opening data file /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!

But the new location is /usr/share/tesseract-ocr/5/tessdata/eng.traineddata.

error from linux mint
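
The log above shows the data being looked up under a 4.00 directory while the file actually lives under 5. A small sketch for checking which version directories really contain a given language file is shown below. The function name is hypothetical, and the default root is taken from the paths in this report, so it may differ on other distributions.

```python
from pathlib import Path


def find_language_files(root="/usr/share/tesseract-ocr", language="eng"):
    """List every <language>.traineddata file one version directory below root."""
    base = Path(root)
    if not base.is_dir():
        return []
    # Matches e.g. <root>/4.00/tessdata/eng.traineddata
    # and <root>/5/tessdata/eng.traineddata.
    return sorted(str(p) for p in base.glob(f"*/tessdata/{language}.traineddata"))
```

If the only match is under a directory other than the one named in the error message, the application and the installed language packages were most likely built for different Tesseract versions.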

Run as a one-shot tool

Hello, thank you very much for this useful program. How about a command-line option that runs the same action the hotkey triggers (copying the recognized text of the selected rectangle), so that a key grabber such as xbindkeys can be used instead of dpscreenocr's own?

Selection is behind newly opened windows in qtile

Hello!
I have a very annoying issue with dpscreenocr. It's a super nice program, but if I open a new window (e.g. a PDF) after launching dpscreenocr, there is no way to OCR the content of the new window, because dpscreenocr's selection box shows up behind the window I need.

I use Linux, X11 with qtile DE.

Show popup window with result text

Currently, the result of an OCR operation seems to be placed only in the history text area, so it is not easy to tell that the process completed successfully. It would be good to show a popup window with the result text after recognition finishes.

Issues with vertical Japanese text

Sorry if this is a stupid question. I read the manual, but I could not decide whether I should use "Split text blocks" or not. I'm using dpscreenocr for reading manga. Thank you.

Enhancement Suggestions for dpScreenOCR

Dear danpla,

I hope this message finds you well. I am an avid user of dpScreenOCR, and I want to express my appreciation for the excellent work you've done on this application. I have been impressed by its speed and efficiency.

As I've been using dpScreenOCR, I've identified three features that I believe would significantly enhance the user experience:

Auto Startup: It would be immensely beneficial to have an option to enable dpScreenOCR to launch automatically upon system startup. This feature would streamline its accessibility and make it readily available whenever the computer is powered on.
Minimize to Tray: Another valuable addition would be the ability to keep dpScreenOCR minimized and in the system tray upon launch. This would contribute to a cleaner desktop environment while ensuring quick access to the application when needed.
Adjustable Line Concatenation Option: Introducing a feature that allows users to enable or disable line concatenation upon copying text would significantly enhance the readability and organization of extracted content. When activated, this option would intelligently join lines of text, eliminating line breaks and presenting a seamless and cohesive view of the captured information.

Additionally, it seems that the "Split Text Blocks" option is not working. Your consideration of these enhancements, including investigating the "Split Text Blocks" issue, would be greatly appreciated. I believe these improvements will contribute to making dpScreenOCR an even more versatile and user-friendly tool.

Thank you for your time and dedication to improving this superb application. I look forward to seeing dpScreenOCR evolve with these potential enhancements.

Using Tesseract 4 languages with Tesseract 5

Hi,

I've downloaded version 1.1.0, which supports Tesseract 5.2.0, but OCR languages refer to Tesseract 4.1:

https://danpla.github.io/dpscreenocr/languages

I'm particularly interested in Spanish.

Thanks in advance and thank you for your programme!

Add translate language feature via translate-shell script

It would be good to integrate a function that translates the result text into another language, using the translate-shell command-line tool.

Will it work on BSDs?

If not, is there any plan to support BSD?

Does not save configs on Windows

Hi, I was using it on Ubuntu without any problems, but on Windows it does not save the config when I reboot the computer. I just created a shortcut to dpscreenocr.exe and copied it to C:\Users\username\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup to make dpscreenocr launch on system startup, but it does not save the configs.
I use Windows 7 with English OCR. By the way, what does the "Split text blocks" option do?

Application failed to start on Windows

I get the following error when starting the program (ZIP-Version):

rpm package

Hello, running sudo alien --to-rpm on the latest deb package produced a perfectly working rpm file on openSUSE 15.4; maybe worth mentioning?

Negative (invert image)

Congratulations on such a promising tool! Thanks for the effort!

It seems that, sometimes, dark text on a light background works better than light text on a dark background (or vice versa), depending on the specific sample.

It would be interesting to have an option to "invert" the image clip (negative) before sending it to Tesseract (a checkbox or similar). Is it possible?

DJuego
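
To illustrate what the requested "negative" step means, here is a minimal sketch that inverts a grayscale image held as rows of integer pixel values. It does not reflect how dpScreenOCR handles images internally; the function name and the plain-list image representation are assumptions made for the example.

```python
def invert_grayscale(pixels, max_value=255):
    """Return the negative of a grayscale image given as rows of integer values."""
    # Each pixel is mirrored around the value range: black (0) becomes
    # white (max_value) and vice versa.
    return [[max_value - value for value in row] for row in pixels]
```

Applied to a clip with light text on a dark background, this produces dark text on a light background, which could then be passed to the OCR engine.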

Does not recognize numbers

When using dpScreenOCR, it does not recognize numbers. dpScreenOCR should (or could) recognize numbers both when they appear inside a text and when the whole OCRed image is a number.

Reading QR-Codes

Maybe you can add the possibility to read qr codes?

How should it work?

I have a QR code on my screen. For example this one:
I start dpScreenOCR.
I press Alt+Q to capture the QR code.
I now find the web link behind the QR code in my clipboard.

What do you think? Thank you for your feedback.

Won't show up on Windows 10

It won't show up. I already placed the language (English) in the correct folder.

I can't set Capslock as cancel selection

I tried adding each of these to the config file: Caps, Capslock, Caps-Lock, CapsLock. None of them works; after restarting dpscreenocr, hotkey_cancel_selection is empty.

Silent autostart (Linux Mint)

At the moment, it seems that it's not possible to start dpscreenocr without its window opening.
I don't need the window; I just want to use the OCR screenshot reading feature through the configured keyboard shortcut.

Also, and maybe obviously, I don't want to start the app manually, so it'd be great if there were an "autostart at system startup" feature (with, as said before, no window opened and shown).

Unwrapping paragraphs

Thank you for creating this cool little app. But there is something important I really miss.

When I try to use dpScreenOCR with this picture

Then I get this output:

Eh bien, je m’entraîne beaur-
coup, je me prépare pour les
championnats suisses, après je
voudrais partir pour les Etats-
Unis. Mais ma mère préfère
rester en Europe.

But what I would like to have is this:
Eh bien, je m’entraîne beaurcoup, je me prépare pour les championnats suisses, après je voudrais partir pour les Etats-Unis. Mais ma mère préfère rester en Europe.

What is the difference?
A) I prefer to have a mode that produces no line breaks.
B) Separated words like "beaur-
coup" at the end of a line should be written together: "beaurcoup".
C) Hyphenated words in which the hyphen is not used to separate (like "Etats-Unis"): The hyphen should be retained here.

If you cannot implement C), it would be good enough to have A) and B).

What do you think?
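
Points A) to C) can be sketched as a small post-processing function. This is only an illustration of the request, not dpScreenOCR's behavior. The function name is hypothetical, and the rule used for C) is a heuristic assumption: a trailing hyphen is dropped when the next line starts with a lowercase letter and kept otherwise.

```python
def unwrap_paragraph(text):
    """Join the lines of one paragraph into a single line.

    Heuristic (an assumption, not a language rule): a hyphen at the end of a
    line is dropped when the next line starts with a lowercase letter and is
    kept otherwise, e.g. "Etats-" followed by "Unis" stays "Etats-Unis".
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    result = ""
    for line in lines:
        if not result:
            result = line
        elif result.endswith("-"):
            if line[0].islower():
                # B) word split across lines: "beaur-" + "coup" -> "beaurcoup"
                result = result[:-1] + line
            else:
                # C) real hyphen: "Etats-" + "Unis" -> "Etats-Unis"
                result = result + line
        else:
            # A) ordinary line break becomes a single space
            result = result + " " + line
    return result
```

With the six-line output quoted above as input, the function returns the single-line text the reporter asks for. The heuristic has obvious limits: a lowercase compound that happens to break at its hyphen would lose the hyphen.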

I've downloaded this for Ubuntu but can't get it to run

I'm new to Ubuntu and have no idea where the download is saved or how to get the application to run. Can you please give some further guidance?

Tray icon mode?

I really love the program and am using it more than I thought I would. Thank you!

However, I had a feature request: run in the background (preferably as a tray icon).

Why do I not get version 1.3.0 via Repo update?

I am a bit confused.

According to https://github.com/danpla/dpscreenocr/releases/tag/v1.3.0, version 1.3.0 is the latest version.

I have added the danpla Repo according to https://software.opensuse.org//download.html?project=home%3Adanpla&package=dpscreenocr:
With ...

echo 'deb http://download.opensuse.org/repositories/home:/danpla/Debian_10/ /' | sudo tee /etc/apt/sources.list.d/home:danpla.list
curl -fsSL https://download.opensuse.org/repositories/home:danpla/Debian_10/Release.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/home_danpla.gpg > /dev/null
sudo apt update
sudo apt install dpscreenocr

... some months ago. This worked.

But when I run this:

$ sudo apt install dpscreenocr
Package lists are read... Ready
Dependency tree is built.       
Status information is read.... Done
dpscreenocr is already the latest version (1.2.0-1).
0 updated, 0 reinstalled, 0 to remove and 0 not updated.

My expectation was that I would get version 1.3.0.

What am I missing?

I would appreciate an answer. Thank you.

Languages are not displayed after upgrading Ubuntu to 22.10 or later

$ apt list --installed 'tesseract-ocr-*'
Listing... Done
tesseract-ocr-eng/lunar,lunar,now 1:4.1.0-2 all [installed]
tesseract-ocr-osd/lunar,lunar,now 1:4.1.0-2 all [installed,automatic]
tesseract-ocr-spa/lunar,lunar,now 1:4.1.0-2 all [installed]

Comparison with other OCR tools

Can you briefly explain the difference between the dpScreenOCR tool, Microsoft's Text Extractor utility from PowerToys, and the OCR tool from the open-source ShareX package?

Actually, it would be ideal to include in the README.md file a brief comparison of these three tools, or a table comparing their features.

Thanks

Slow recognition due to multithreading issues in Tesseract 4 and 5

Even when testing with one language, it takes ten seconds to recognize. OCR'ing the same image with tesseract via the command line takes under a second.

I have it set to recognize only one language in both cases, so I am not sure what's causing the delay in dpscreenocr.
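
The issue title attributes the slowdown to multithreading in Tesseract. One way to check this from the command line is to time the tesseract CLI with and without OMP_THREAD_LIMIT=1, a standard OpenMP environment variable. The sketch below assumes the installed Tesseract was built with OpenMP and that the tesseract executable is on PATH; the function names are hypothetical.

```python
import os
import subprocess
import time


def single_thread_env(base_env=None):
    """Copy an environment and cap OpenMP at one thread via OMP_THREAD_LIMIT."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_THREAD_LIMIT"] = "1"
    return env


def time_tesseract(image_path, language="eng", env=None):
    """Run the tesseract CLI once and return the elapsed wall-clock seconds."""
    # "stdout" as the output base makes tesseract print the text
    # instead of writing a file.
    command = ["tesseract", image_path, "stdout", "-l", language]
    start = time.perf_counter()
    subprocess.run(command, env=env, check=True, capture_output=True)
    return time.perf_counter() - start
```

Comparing `time_tesseract(path)` with `time_tesseract(path, env=single_thread_env())` on the same image gives a rough idea of whether thread contention is involved. It says nothing about how dpscreenocr itself calls the library.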

Does dpscreenocr do postprocessing

I'm trying to find ways to improve OCR quality. I have seen some OCR postprocess tweaking with ImageMagick, so I wonder: does dpscreenocr already do that?

Cannot install English language on Ubuntu 20.04

After installation using the PPA, I installed tesseract-ocr-eng, but it wasn't recognized. So I reinstalled the app. It still does not recognize the language.

This is very bad UX, and no examples for English packages are given in the manual.

I'm on Ubuntu 20.04

Problems with oblique text

Due to the nature of the program, it cannot detect oblique text or words; even when it does detect them, it detects them incorrectly. An angled area selection would be nice.

Screenshots please

Please provide a screenshot or a video link to help users better understand how the tool works before installing the app.

I have found a video, although not very well recorded: https://www.youtube.com/watch?v=DCbfp0zMJCk

... but several screenshots from the application would be much better.

Scoop Package Manager

Maybe you could "integrate" it somehow in the Scoop package manager for Windows, for example.

In order to easily and quickly test the application, adding a Scoop manifest JSON file might be a good idea,
then I could install it quickly with:

scoop install https://raw.githubusercontent.com/danpla/dpscreenocr/master/dpscreenocr.json

or something similar.

For now, that's enough and too much from me...
Thanks!

Minimize to system tray option

Great program, good job.
Is it possible to add an option to minimize the program to the system tray?
I use Kubuntu 18.04.

Start minimized to tray

Hello,
Please allow me a suggestion. It would be nice to have an option to start the program minimized in the system tray. In my case, the program starts when Windows starts and I use it with a hotkey when I need it. I can minimize it to the system tray using another program, but that has to be done every time the session starts. An option to minimize itself to the system tray on startup would be wonderful.

Greetings, and congratulations on this fantastic program.

Crash/BSOD on Windows 11 when using Hebrew

Problem:

dpscreenocr uses Tesseract to recognize text on the screen. However, Tesseract has known issues with recognizing text from images captured by Intel UHD 600 graphics cards (like BSOD).

Solution:

Add support for Tesseract-OCR-DNN to dpscreenocr. Tesseract-OCR-DNN is a deep learning-based OCR engine that is not affected by the same issues as Tesseract.

Benefits:

Improved accuracy for recognizing text from images captured by Intel UHD 600 graphics cards.
Increased flexibility for users to choose the best OCR engine for their needs.

Implementation:

Add a new configuration option to dpscreenocr that allows users to choose between using Tesseract or Tesseract-OCR-DNN.
Update the code to use Tesseract-OCR-DNN when the new configuration option is set to "Tesseract-OCR-DNN".

Conclusion:

Adding support for Tesseract-OCR-DNN to dpscreenocr would be a valuable addition to the program. It would provide users with improved accuracy and flexibility, making it a more powerful tool for recognizing text on the screen.
