This Python script uses Tesseract OCR and regular expressions to extract specific fields from invoice images. It can pull out information such as invoice numbers, dates, and company names from invoices that follow a generalized format.
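
As a rough illustration of the approach described above, the sketch below runs OCR on an image and then applies regular expressions to the recognized text. The patterns, the field names, and the function names `extract_fields` and `ocr_image` are assumptions made for illustration, not the script's actual code; real invoices will likely need different patterns.

```python
import re

# Hypothetical patterns; adjust them to the invoice layout at hand.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE
    ),
    "date": re.compile(
        r"Date\s*:?\s*(\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4})", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return the first match for each pattern, or None when a field is absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


def ocr_image(path):
    """Run Tesseract on an image file and return the recognized text."""
    # Imported here so the regex part can be tried without the OCR stack.
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        return pytesseract.image_to_string(image)


# Stand-in for OCR output, so the regex step can be tried on its own.
sample_text = "ACME Corp\nInvoice No: INV-1042\nDate: 12/03/2024\nTotal: 99.00"
print(extract_fields(sample_text))
```

With a real image, the two steps would be chained as `extract_fields(ocr_image("invoice.png"))`, where `invoice.png` is a placeholder file name. Keeping the patterns in one dictionary makes it easy to add or tune fields without touching the OCR step.
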
- Python 3.x
- Tesseract OCR
- Required Python libraries (install using pip):
  - pytesseract
  - Pillow (PIL)
- Install the required Python libraries:

      pip install pytesseract Pillow
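
After installation, a quick check like the one below can confirm that the prerequisites are visible to Python. This is only a sketch: the helper name `check_environment` is made up for illustration, and it assumes the Tesseract executable is named `tesseract` and is on the `PATH`. Note that `pytesseract` is a wrapper, so the Tesseract engine itself has to be installed separately from the `pip` packages.

```python
import importlib.util
import shutil


def check_environment():
    """Report which prerequisites appear to be available."""
    return {
        # Pillow is imported under the module name "PIL".
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "Pillow": importlib.util.find_spec("PIL") is not None,
        # Assumes the engine's executable is called "tesseract" and is on PATH.
        "tesseract binary": shutil.which("tesseract") is not None,
    }


for name, available in check_environment().items():
    print(f"{name}: {'found' if available else 'missing'}")
```
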