OCR Expense Report

🔬 Study project from phase 3 of the FIAP MBA in Software Engineering. It uses OCR to extract information from invoices for expense reports.

Problem

Currently, the company has an OCR (Optical Character Recognition) solution that successfully performs OCR on traditional documents with a white background and dark characters. This solution works offline, without the need to consult third-party services.

During the development of a new corporate reimbursement platform, the software proved ineffective at reading hypermarket tax coupons. Most tax receipts are noisy (poorly printed or defective regions), have a yellow background, and may be misaligned in the image. Because the current technology was modeled for traditional documents, it cannot handle this type of noise.

Therefore, the solution must be adjusted to handle this type of task by improving the computer vision techniques used for noise handling and OCR.

The objective of the task is to implement a solution that performs OCR on tax coupons and extracts, from the resulting text, the information listed in the customer's needs. The OCR process should preferably run offline, that is, without third-party cloud services. Open-source libraries are allowed (and recommended) for image manipulation, OCR, and information extraction from the text. Students are free to research and use their creativity to solve the problem.

The software needs to be trained to extract information from tax receipts of various types, including receipts in less-than-ideal reading conditions such as shadow, low light, or excessive brightness.

Solution

Infrastructure

  • Deploy the application to Vercel
  • Integration with Airtable
  • Integration with S3 for uploading images

Application

  • Installation of the Refine boilerplate
  • Listing and deletion page
  • Creation page
  • Edit page
  • Details page
  • Text extraction by OCR
  • Save extracted content in Markdown table format
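The "Markdown table format" item can be illustrated with a minimal sketch. The helper names `escapeCell` and `toMarkdownTable` and the sample columns are assumptions for illustration, not code taken from this repository.

```typescript
// Hypothetical helpers: render headers and rows as a Markdown table.
type Row = string[];

function escapeCell(value: string): string {
  // Pipes would split the cell and newlines would break the row.
  return value.replace(/\|/g, "\\|").replace(/\r?\n/g, " ").trim();
}

function toMarkdownTable(headers: string[], rows: Row[]): string {
  const line = (cells: string[]) => `| ${cells.map(escapeCell).join(" | ")} |`;
  const separator = `| ${headers.map(() => "---").join(" | ")} |`;
  // Pad or truncate each row so it has exactly one cell per header.
  const body = rows.map((row) => line(headers.map((_, i) => row[i] ?? "")));
  return [line(headers), separator, ...body].join("\n");
}

// Sample data invented for the example.
const table = toMarkdownTable(
  ["Item", "Qty", "Price"],
  [
    ["Coffee 500g", "1", "18.90"],
    ["Milk 1L", "2"],
  ]
);
console.log(table);
```

Rows shorter than the header are padded with empty cells, so a partially recognized receipt line still produces a well-formed table.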

Others

  • Authentication
  • Expense report build
  • E-mail expense report

Getting Started

This software solution uses the following services and technologies:

  • Next.js as the React framework, with support for server-side rendering
  • Refine as the interface/solution library to accelerate software development
  • Vercel for hosting and CI/CD
  • Airtable as the database, with its integrated APIs
  • AWS S3 (storage), IAM (security) and Textract (machine learning)

To run this solution on your local machine, clone the repository:

git clone https://github.com/vrbarros/ocr-expense-report.git

Create a .env file with the following variables:

NEXT_PUBLIC_AIRTABLE_API_TOKEN=
NEXT_PUBLIC_AIRTABLE_BASE_ID=
AWS_S3_BUCKET_NAME=
AWS_S3_ACCESS_KEY=
AWS_S3_SECRET=
AWS_S3_USER=
AWS_S3_REGION=

Fill in the variables after setting up each of the services below.
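A missing variable is easier to diagnose at startup than at the first failed upload. The following is a minimal sketch of such a check; the helper `missingEnvVars` is hypothetical and not part of this repository, while the variable names come from the .env listing above.

```typescript
// Variable names are taken from the .env listing above.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_AIRTABLE_API_TOKEN",
  "NEXT_PUBLIC_AIRTABLE_BASE_ID",
  "AWS_S3_BUCKET_NAME",
  "AWS_S3_ACCESS_KEY",
  "AWS_S3_SECRET",
  "AWS_S3_USER",
  "AWS_S3_REGION",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}

// Example with a partially filled environment; in the application you would pass process.env.
const missing = missingEnvVars({
  NEXT_PUBLIC_AIRTABLE_API_TOKEN: "token",
  AWS_S3_BUCKET_NAME: "  ",
});
console.log(missing);
```

Whitespace-only values count as missing, which catches lines left as `NAME= ` in the .env file.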

Airtable

  • Create a new workspace with a table called receipts and the following columns:

    • name: Text
    • notes: Long text
    • attachments: Attachment
    • status: Single select
    • officialName: Text
    • total: Currency
  • Get your API key from your account page, generating a new one if necessary, and add it to the NEXT_PUBLIC_AIRTABLE_API_TOKEN variable in your .env file.

  • Get your base ID from the Airtable API page for your table and add it to the NEXT_PUBLIC_AIRTABLE_BASE_ID variable in your .env file.
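To show how the base ID, token, and column names fit together, here is a minimal sketch of the request that creates one record through Airtable's REST API. The helper `buildCreateReceiptRequest`, the sample values, and the `status` option are assumptions for illustration; the application itself may go through a Refine data provider rather than a raw HTTP call.

```typescript
// Field names follow the receipts table described above.
interface ReceiptFields {
  name: string;
  notes?: string;
  attachments?: { url: string }[];
  status?: string;
  officialName?: string;
  total?: number;
}

interface AirtableRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the HTTP request that creates one record.
function buildCreateReceiptRequest(
  baseId: string,
  apiToken: string,
  fields: ReceiptFields
): AirtableRequest {
  return {
    url: `https://api.airtable.com/v0/${encodeURIComponent(baseId)}/receipts`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ fields }),
  };
}

// Placeholder base ID and token; replace them with the values from your .env file.
const createRequest = buildCreateReceiptRequest("appXXXXXXXXXXXXXX", "YOUR_TOKEN", {
  name: "Hypermarket receipt",
  status: "pending", // hypothetical option; use one defined in your Single select column
  total: 42.5,
});
console.log(createRequest.url);
```

The returned object can be passed to `fetch(createRequest.url, createRequest)`; keeping the builder free of network calls makes it easy to check the payload in isolation.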

AWS S3

  • Create a new S3 bucket with public read permission
  • Set a custom Bucket Policy with the following settings:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PutObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:Put*",
                "s3:Get*"
            ],
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
        }
    ]
}
  • Set custom CORS settings with the following configuration:
[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "PUT"
        ],
        "AllowedOrigins": [
            "*"
        ],
        "ExposeHeaders": [],
        "MaxAgeSeconds": 3000
    }
]
  • Set AWS_S3_BUCKET_NAME to your bucket name and AWS_S3_REGION to the bucket's AWS region.
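With public read enabled, an uploaded image can be referenced by URL, for example in the attachments column. A minimal sketch of building that URL from the bucket name and region follows; the helper `publicObjectUrl`, the bucket name, and the object key are hypothetical.

```typescript
// Hypothetical helper: builds the virtual-hosted-style URL of an object in a public bucket.
function publicObjectUrl(bucket: string, region: string, key: string): string {
  // Encode each path segment but keep "/" as the separator.
  const encodedKey = key.split("/").map(encodeURIComponent).join("/");
  return `https://${bucket}.s3.${region}.amazonaws.com/${encodedKey}`;
}

// Bucket, region, and key are made-up example values.
const receiptUrl = publicObjectUrl("my-receipts-bucket", "us-east-1", "receipts/coupon 01.jpg");
console.log(receiptUrl);
```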

AWS IAM

  • Create a new AWS IAM user with a restricted policy for security reasons
  • Attach the following permissions policies to the new user:
    • AmazonS3FullAccess
    • AmazonTextractFullAccess
  • Generate a new Access Key and Secret for the user and use them for AWS_S3_ACCESS_KEY and AWS_S3_SECRET
  • Set AWS_S3_USER to the username you created

AWS Textract

AWS Textract extracts text and data from document images. This solution uses it for the OCR step.
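Textract returns its result as a list of blocks, each with a type (such as PAGE, LINE, or WORD), the recognized text, and a confidence score. The sketch below shows one way to turn such a response into plain text lines. The helper `extractLines`, the confidence threshold, and the sample data are assumptions for illustration, not code from this repository.

```typescript
// Minimal subset of the Textract response shape used here.
interface TextractBlock {
  BlockType?: string;
  Text?: string;
  Confidence?: number;
}

interface TextractResponse {
  Blocks?: TextractBlock[];
}

// Hypothetical helper: returns the text of LINE blocks at or above a confidence threshold.
function extractLines(response: TextractResponse, minConfidence = 0): string[] {
  return (response.Blocks ?? [])
    .filter((b) => b.BlockType === "LINE" && (b.Confidence ?? 0) >= minConfidence)
    .map((b) => (b.Text ?? "").trim())
    .filter((text) => text.length > 0);
}

// Hand-written sample data, not real Textract output.
const sample: TextractResponse = {
  Blocks: [
    { BlockType: "PAGE" },
    { BlockType: "LINE", Text: "HYPERMARKET LTDA", Confidence: 99.1 },
    { BlockType: "WORD", Text: "HYPERMARKET", Confidence: 99.3 },
    { BlockType: "LINE", Text: "TOTAL 42,50", Confidence: 97.4 },
    { BlockType: "LINE", Text: "#@!", Confidence: 31.0 },
  ],
};

const ocrLines = extractLines(sample, 80);
console.log(ocrLines.join("\n"));
```

Filtering on confidence is one simple way to drop noise from poorly printed regions; the right threshold depends on the receipts being processed.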

How to Run

Using Yarn you can run this application locally with:

  1. Install application dependencies:
yarn
  2. Start the development server:
yarn dev

Authors

  • Victor Barros
  • Chong Chung Lan
