writecrow / ocr2text
Convert a PDF via OCR to a TXT file in UTF-8 encoding
License: MIT License
Here's the output
pip install --user --requirement requirements.txt
Collecting Pillow==6.2.0
Using cached Pillow-6.2.0.tar.gz (37.4 MB)
Collecting pdf2image==1.9.0
Using cached pdf2image-1.9.0.tar.gz (7.4 kB)
Collecting pytesseract==0.2.7
Using cached pytesseract-0.2.7.tar.gz (169 kB)
Building wheels for collected packages: Pillow, pdf2image, pytesseract
Building wheel for Pillow (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'c:\python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Brian\\AppData\\Local\\Temp\\pip-install-xja2zcqc\\Pillow\\setup.py'"'"'; __file__='"'"'C:\\Users\\Brian\\AppData\\Local\\Temp\\pip-install-xja2zcqc\\Pillow\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Brian\AppData\Local\Temp\pip-wheel-085v4_1w'
cwd: C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\
Complete output (174 lines):
C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py:28: RuntimeWarning: Pillow does not yet support Python 3.8 and does not yet provide prebuilt Windows binaries. We do not recommend building from source on Windows.
warnings.warn(
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-3.8
creating build\lib.win-amd64-3.8\PIL
copying src\PIL\BdfFontFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\BlpImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\BmpImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\BufrStubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ContainerIO.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\CurImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\DcxImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\DdsImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\EpsImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ExifTags.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\features.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FitsStubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FliImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FontFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FpxImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FtexImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GbrImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GdImageFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GifImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GimpGradientFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GimpPaletteFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GribStubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\Hdf5StubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\IcnsImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\IcoImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\Image.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageChops.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageCms.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageColor.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageDraw.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageDraw2.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageEnhance.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageFilter.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageFont.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageGrab.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageMath.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageMode.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageMorph.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageOps.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImagePalette.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImagePath.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageQt.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageSequence.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageShow.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageStat.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageTk.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageTransform.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageWin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImtImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\IptcImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\Jpeg2KImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\JpegImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\JpegPresets.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\McIdasImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MicImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MpegImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MpoImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MspImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PaletteFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PalmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PcdImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PcfFontFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PcxImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PdfImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PdfParser.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PixarImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PngImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PpmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PsdImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PSDraw.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PyAccess.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\SgiImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\SpiderImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\SunImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TarIO.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TgaImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TiffImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TiffTags.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\WalImageFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\WebPImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\WmfImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\XbmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\XpmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\XVThumbImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_binary.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_tkinter_finder.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_util.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_version.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\__init__.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\__main__.py -> build\lib.win-amd64-3.8\PIL
running egg_info
writing src\Pillow.egg-info\PKG-INFO
writing dependency_links to src\Pillow.egg-info\dependency_links.txt
writing top-level names to src\Pillow.egg-info\top_level.txt
reading manifest file 'src\Pillow.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.c'
warning: no files found matching '*.h'
warning: no files found matching '*.sh'
warning: no previously-included files found matching '.appveyor.yml'
warning: no previously-included files found matching '.coveragerc'
warning: no previously-included files found matching '.codecov.yml'
warning: no previously-included files found matching '.editorconfig'
warning: no previously-included files found matching '.readthedocs.yml'
warning: no previously-included files found matching 'azure-pipelines.yml'
warning: no previously-included files matching '.git*' found anywhere in distribution
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
no previously-included directories found matching '.azure-pipelines'
no previously-included directories found matching '.travis'
writing manifest file 'src\Pillow.egg-info\SOURCES.txt'
running build_ext
The headers or library files could not be found for zlib,
a required dependency when compiling Pillow from source.
Please see the install instructions at:
https://pillow.readthedocs.io/en/latest/installation.html
Traceback (most recent call last):
File "C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py", line 852, in <module>
setup(
File "c:\python38\lib\site-packages\setuptools\__init__.py", line 145, in setup
return distutils.core.setup(**attrs)
File "c:\python38\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "c:\python38\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "c:\python38\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\Brian\AppData\Roaming\Python\Python38\site-packages\wheel\bdist_wheel.py", line 223, in run
self.run_command('build')
File "c:\python38\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "c:\python38\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "c:\python38\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "c:\python38\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "c:\python38\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "c:\python38\lib\distutils\command\build_ext.py", line 340, in run
self.build_extensions()
File "C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py", line 687, in build_extensions
raise RequiredDependencyException(f)
__main__.RequiredDependencyException: zlib
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py", line 902, in <module>
raise RequiredDependencyException(msg)
__main__.RequiredDependencyException:
The headers or library files could not be found for zlib,
a required dependency when compiling Pillow from source.
Please see the install instructions at:
https://pillow.readthedocs.io/en/latest/installation.html
----------------------------------------
ERROR: Failed building wheel for Pillow
Running setup.py clean for Pillow
Building wheel for pdf2image (setup.py) ... done
Created wheel for pdf2image: filename=pdf2image-1.9.0-py3-none-any.whl size=8095 sha256=a7fa922f68dd44e2806eeb90bfdc20cd5c082b8311a68208c20b3f68d41df5ba
Stored in directory: c:\users\brian\appdata\local\pip\cache\wheels\a2\1e\33\39929600b164d682492debff5e610bd97faa7d9eb400b1bdbd
Building wheel for pytesseract (setup.py) ... done
Created wheel for pytesseract: filename=pytesseract-0.2.7-py2.py3-none-any.whl size=165741 sha256=504ee36854dcafde80c7401d203724b319ca1c0bee30bbcf42a6a0a01f2bcc14
Stored in directory: c:\users\brian\appdata\local\pip\cache\wheels\58\c2\2a\c7c38950f8b3db30009e04ebd0cc000e6ae1af83f50b11c53c
Successfully built pdf2image pytesseract
Failed to build Pillow
Installing collected packages: Pillow, pdf2image, pytesseract
Running setup.py install for Pillow ... error
ERROR: Command errored out with exit status 1:
command: 'c:\python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Brian\\AppData\\Local\\Temp\\pip-install-xja2zcqc\\Pillow\\setup.py'"'"'; __file__='"'"'C:\\Users\\Brian\\AppData\\Local\\Temp\\pip-install-xja2zcqc\\Pillow\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Brian\AppData\Local\Temp\pip-record-ndd5_kmb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\Brian\AppData\Roaming\Python\Python38\Include\Pillow'
cwd: C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\
Complete output (176 lines):
C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py:28: RuntimeWarning: Pillow does not yet support Python 3.8 and does not yet provide prebuilt Windows binaries. We do not recommend building from source on Windows.
warnings.warn(
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.8
creating build\lib.win-amd64-3.8\PIL
copying src\PIL\BdfFontFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\BlpImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\BmpImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\BufrStubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ContainerIO.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\CurImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\DcxImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\DdsImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\EpsImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ExifTags.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\features.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FitsStubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FliImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FontFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FpxImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\FtexImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GbrImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GdImageFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GifImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GimpGradientFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GimpPaletteFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\GribStubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\Hdf5StubImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\IcnsImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\IcoImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\Image.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageChops.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageCms.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageColor.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageDraw.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageDraw2.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageEnhance.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageFilter.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageFont.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageGrab.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageMath.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageMode.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageMorph.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageOps.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImagePalette.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImagePath.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageQt.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageSequence.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageShow.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageStat.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageTk.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageTransform.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImageWin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\ImtImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\IptcImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\Jpeg2KImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\JpegImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\JpegPresets.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\McIdasImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MicImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MpegImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MpoImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\MspImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PaletteFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PalmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PcdImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PcfFontFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PcxImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PdfImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PdfParser.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PixarImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PngImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PpmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PsdImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PSDraw.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\PyAccess.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\SgiImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\SpiderImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\SunImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TarIO.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TgaImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TiffImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\TiffTags.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\WalImageFile.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\WebPImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\WmfImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\XbmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\XpmImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\XVThumbImagePlugin.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_binary.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_tkinter_finder.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_util.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\_version.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\__init__.py -> build\lib.win-amd64-3.8\PIL
copying src\PIL\__main__.py -> build\lib.win-amd64-3.8\PIL
running egg_info
writing src\Pillow.egg-info\PKG-INFO
writing dependency_links to src\Pillow.egg-info\dependency_links.txt
writing top-level names to src\Pillow.egg-info\top_level.txt
reading manifest file 'src\Pillow.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.c'
warning: no files found matching '*.h'
warning: no files found matching '*.sh'
warning: no previously-included files found matching '.appveyor.yml'
warning: no previously-included files found matching '.coveragerc'
warning: no previously-included files found matching '.codecov.yml'
warning: no previously-included files found matching '.editorconfig'
warning: no previously-included files found matching '.readthedocs.yml'
warning: no previously-included files found matching 'azure-pipelines.yml'
warning: no previously-included files matching '.git*' found anywhere in distribution
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
no previously-included directories found matching '.azure-pipelines'
no previously-included directories found matching '.travis'
writing manifest file 'src\Pillow.egg-info\SOURCES.txt'
running build_ext
The headers or library files could not be found for zlib,
a required dependency when compiling Pillow from source.
Please see the install instructions at:
https://pillow.readthedocs.io/en/latest/installation.html
Traceback (most recent call last):
File "C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py", line 852, in <module>
setup(
File "c:\python38\lib\site-packages\setuptools\__init__.py", line 145, in setup
return distutils.core.setup(**attrs)
File "c:\python38\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "c:\python38\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "c:\python38\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "c:\python38\lib\site-packages\setuptools\command\install.py", line 61, in run
return orig.install.run(self)
File "c:\python38\lib\distutils\command\install.py", line 545, in run
self.run_command('build')
File "c:\python38\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "c:\python38\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "c:\python38\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "c:\python38\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "c:\python38\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "c:\python38\lib\distutils\command\build_ext.py", line 340, in run
self.build_extensions()
File "C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py", line 687, in build_extensions
raise RequiredDependencyException(f)
__main__.RequiredDependencyException: zlib
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Brian\AppData\Local\Temp\pip-install-xja2zcqc\Pillow\setup.py", line 902, in <module>
raise RequiredDependencyException(msg)
__main__.RequiredDependencyException:
The headers or library files could not be found for zlib,
a required dependency when compiling Pillow from source.
Please see the install instructions at:
https://pillow.readthedocs.io/en/latest/installation.html
----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Brian\\AppData\\Local\\Temp\\pip-install-xja2zcqc\\Pillow\\setup.py'"'"'; __file__='"'"'C:\\Users\\Brian\\AppData\\Local\\Temp\\pip-install-xja2zcqc\\Pillow\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\Brian\AppData\Local\Temp\pip-record-ndd5_kmb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\Brian\AppData\Roaming\Python\Python38\Include\Pillow' Check the logs for full command output.
WARNING: You are using pip version 20.1; however, version 20.1.1 is available.
You should consider upgrading via the 'c:\python38\python.exe -m pip install --upgrade pip' command.
There is a well-supported JavaScript port of the Tesseract OCR engine (Tesseract.js) at https://tesseract.projectnaptha.com/
Combined with the Electron application framework (https://www.electronjs.org/), this could provide a path forward for creating a library-independent build of this project. Up to now, the problem has been the reliance on the Tesseract C++ library.
Running on a nested directory set with ~3000 PDFs, up to 150 MB in size each.
Windows 10.
Python 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Latest versions of the required libraries.
Over 500 GB free on D:\
5 GB free on C:
I suspect C: is being used for temp files?
I'm going to look into this now, but I'm posting in case anyone has an immediate solution.
Percent: [####------------------------------------------------------------------------------------------------] 4.4090403853278985%
Traceback (most recent call last):
File "D:\ocr2text-main\ocr2text.py", line 151, in <module>
count = convert_recursive(source, destination, count)
File "D:\ocr2text-main\ocr2text.py", line 116, in convert_recursive
count = convert(source_path, output_filename, count, pdfCounter)
File "D:\ocr2text-main\ocr2text.py", line 121, in convert
text = extract_tesseract(sourcefile)
File "D:\ocr2text-main\ocr2text.py", line 90, in extract_tesseract
page_content = pytesseract.image_to_string(Image.open(page_path))
File "C:\Python39\lib\site-packages\pytesseract\pytesseract.py", line 413, in image_to_string
return {
File "C:\Python39\lib\site-packages\pytesseract\pytesseract.py", line 416, in <lambda>
Output.STRING: lambda: run_and_get_output(*args),
File "C:\Python39\lib\site-packages\pytesseract\pytesseract.py", line 273, in run_and_get_output
with save(image) as (temp_name, input_filename):
File "C:\Python39\lib\contextlib.py", line 117, in __enter__
return next(self.gen)
File "C:\Python39\lib\site-packages\pytesseract\pytesseract.py", line 196, in save
image.save(input_file_name, format=image.format)
File "C:\Python39\lib\site-packages\PIL\Image.py", line 2212, in save
save_handler(self, fp, filename)
File "C:\Python39\lib\site-packages\PIL\PpmImagePlugin.py", line 149, in _save
ImageFile._save(im, fp, [("raw", (0, 0) + im.size, 0, (rawmode, 0, 1))])
File "C:\Python39\lib\site-packages\PIL\ImageFile.py", line 527, in _save
s = e.encode_to_file(fh, bufsize)
OSError: [Errno 28] No space left on device
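One way to test the "C: is being used for temp" suspicion is to ask Python where its temp directory is and how much space that volume has left. The sketch below is only a starting point: it assumes the intermediate image in the traceback is written through Python's standard `tempfile` module (the `save` frame in pytesseract suggests so, but I have not confirmed it), and the helper names `free_gib` and `use_temp_dir` as well as the `D:\ocr_tmp` path are made up for illustration.

```python
import os
import shutil
import tempfile


def free_gib(path):
    """Return the free space, in GiB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def use_temp_dir(path):
    """Point this process's tempfile module, and child processes, at `path`."""
    os.makedirs(path, exist_ok=True)
    # tempfile.* functions in this process consult this module-level variable.
    tempfile.tempdir = path
    # Subprocesses inherit the environment, so set the usual variables as well.
    for var in ("TMPDIR", "TEMP", "TMP"):
        os.environ[var] = path
    return tempfile.gettempdir()


if __name__ == "__main__":
    current = tempfile.gettempdir()
    print("temp dir:", current)
    print("free GiB:", round(free_gib(current), 1))
    # Hypothetical target on the larger drive; uncomment and adjust as needed.
    # use_temp_dir(r"D:\ocr_tmp")
```

If the reported temp dir is on C:, calling `use_temp_dir` near the top of the script, before any conversion starts, may be enough to move the intermediate files to the drive with more room. Setting the `TEMP`/`TMP` environment variables before launching Python should have a similar effect without touching the code.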
My environment is Python 3.8.5, macOS 11.1.
I'm taking a peek at this code to understand how to interface with tesseract, and the code fails linter checks:
Undefined variable 'exceptions'
referring to L66:
raise exceptions.ShellError(
" ".join(args),
127,
"",
"",
)
Is the ShellError class defined in a package that was left out of the requirements?
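Until that is answered, a local stand-in would at least make the name resolve. This is a minimal sketch, not the project's actual code: the `ShellError` class and the `run` helper below are hypothetical, and the only thing taken from the snippet above is the argument order (command string, exit code 127, empty stdout, empty stderr).

```python
import subprocess


class ShellError(Exception):
    """Hypothetical stand-in: raised when an external command fails."""

    def __init__(self, command, exit_code, stdout, stderr):
        self.command = command
        self.exit_code = exit_code
        self.stdout = stdout
        self.stderr = stderr
        super().__init__(
            f"The command `{command}` failed with exit code {exit_code}"
        )


def run(args):
    """Run a command and return its stdout, raising ShellError on failure."""
    try:
        proc = subprocess.run(args, capture_output=True, text=True)
    except FileNotFoundError:
        # 127 is the conventional shell exit code for "command not found".
        raise ShellError(" ".join(args), 127, "", "") from None
    if proc.returncode != 0:
        raise ShellError(
            " ".join(args), proc.returncode, proc.stdout, proc.stderr
        )
    return proc.stdout
```

With something like this in place, a missing `tesseract` executable would surface as a `ShellError` with `exit_code == 127` instead of a `NameError` on `exceptions`.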