microsoft / ocr-form-tools

A set of tools to use in Microsoft Azure Form Recognizer and OCR services.

License: MIT License

Dockerfile 0.04% Shell 0.18% HTML 0.10% JavaScript 0.27% CSS 0.01% TypeScript 89.58% Python 4.44% SCSS 5.36% Procfile 0.01%

ocr-form-labeling rpa machine-learning machine-learning-algorithms form-recognizer labeling-tool typescript

ocr-form-tools's Introduction

Form OCR Testing Tool

Help us improve Form Recognizer. Take our survey!

Features Preview

An open source labeling tool for Form Recognizer, part of the Form OCR Test Toolset (FOTT).

This is the main branch of the tool. It contains all the newest features available. Because it is a preview, it is NOT the most stable version.

The purpose of this repo is to allow customers to test the tools available when working with Microsoft Forms and OCR services. Currently, the labeling tool is the first tool we present here. Users can provide feedback and make customer-specific changes to meet their unique needs. The Microsoft Azure Form Recognizer team will update the source code periodically. If you would like to contribute, please check the contributing section.

If you want to check out our latest GA version, please go to Form-Recognizer-Toolkit, or use Form Recognizer Studio.
If you want to check out our V2.1 GA version of the tool, please follow this link.
If you want to check out our V2.0 GA version of the tool, please follow this link.

FOTT's Labeling Tool is a React + Redux Web application, written in TypeScript. This project was bootstrapped with Create React App.

Current Features of Labeling Tool: (you can view a short demo here)

Label forms in PDF, JPEG or TIFF formats.
Train model with labeled data through Form Recognizer
Predict/Analyze a single form with the trained model, to extract key-value predictions/analyses for the form.

Getting Started

Build and run from source

Form Labeling Tool requires NodeJS (>= 10.x, Dubnium) and NPM

 git clone https://github.com/Microsoft/OCR-Form-Tools.git
 cd OCR-Form-Tools
 yarn install
 yarn build
 yarn start

Set up this tool with Docker

Please see instructions here, and view our docker hub repository here for the latest container image info.

latest: Docker image tags track the general availability releases of FOTT.
latest-preview: Docker image tags track the preview releases of FOTT.
latest-preview-private: Docker image tags track the private preview releases of FOTT.

Run as web application

Using a modern Web browser, FOTT can be run directly at:

https://fott-2-1.azurewebsites.net/

Note: this web app is for testing purposes only. HTTPS is required, unless it's for localhost.

Run as desktop application

FOTT can be run as a desktop application after initial set up.

 yarn electron-start

The desktop application has additional features, such as:

Support for local file system as provider storage
Support for cross-domain resource requests

Release as desktop application

FOTT can be released as a distributable desktop application.

 yarn release

The distributable will be saved in the releases folder of the cloned repository.

Using labeling tool

Set up input data

To go through a complete label-train-analyze scenario, you need a set of at least six forms of the same type. You will label five forms to train a model and use one form to test it. Upload the sample files to the root of a blob storage container in an Azure Storage account; you can use Azure Storage Explorer to upload data. For advanced scenarios where forms come in quite different formats, you can organize them into subfolders by format. When you set up a project to train a model for one format, specify the corresponding subfolder on the project settings page.

Configure cross-domain resource sharing (CORS)

Enable CORS on your storage account. Select your storage account in the Azure portal and click the CORS tab on the left pane. On the bottom line, fill in the following values. Then click Save at the top.

Allowed origins = *
Allowed methods = [select all]
Allowed headers = *
Exposed headers = *
Max age = 200

Create Connections

Form OCR Testing Tool is a 'Bring Your Own data' (BYOD) application. In this tool, connections are used to configure and manage source (the assets to label) and target (the location where labels should be exported). The source and target are the same location in Form OCR Testing Tool. Eventually, they together will be inputs to Form Recognizer. Connections can be set up and shared across projects. They use an extensible provider model, so new source/target providers can easily be added.

Currently, both this labeling tool and Form Recognizer only support Azure blob storage.

To create a new connection, click the New Connections (plug) icon, in the left hand navigation bar.

Fill in the fields with the following values:

Display Name - The connection display name.
Description - Your project description.
SAS URL - The shared access signature (SAS) URL of your Azure blob storage container. To retrieve the SAS URL, open the Microsoft Azure Storage Explorer, right-click your container (note: not the parent storage node, not the URL in your Azure portal), and select Get shared access signature. Set the expiry time to some time after you'll have used the service. Make sure the Read, Write, Delete, and List permissions are checked, and click Create. Then copy the value in the URL section. It should have the following format: https://<storage account>.blob.core.windows.net/<container name>?<SAS value>.
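
A pasted SAS URL can be sanity-checked before it goes into the connection form. The following is a minimal sketch, not part of FOTT: check_sas_url is a hypothetical helper, and it assumes the standard SAS query parameters sp (signed permissions, with the letters r, w, d, l for Read, Write, Delete, List) and sig (signature).

```python
from typing import List
from urllib.parse import urlparse, parse_qs

# Read, Write, Delete, List: the permissions the connection step asks for.
REQUIRED_PERMISSIONS = set("rwdl")


def check_sas_url(sas_url: str) -> List[str]:
    """Return a list of problems found in a container SAS URL (empty if none)."""
    problems = []
    parts = urlparse(sas_url)
    if parts.scheme != "https":
        problems.append("URL should use https")
    if not (parts.hostname or "").endswith(".blob.core.windows.net"):
        problems.append("host should be <storage account>.blob.core.windows.net")
    # The path should name exactly one container, not a blob or the account root.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 1:
        problems.append("path should be a single container name")
    query = parse_qs(parts.query)
    if "sig" not in query:
        problems.append("SAS signature (sig) is missing")
    granted = set(query.get("sp", [""])[0])
    missing = REQUIRED_PERMISSIONS - granted
    if missing:
        problems.append("missing permissions: " + "".join(sorted(missing)))
    return problems
```

An empty list means the URL has the expected shape; it does not prove that the token is still valid or that CORS is enabled on the account.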

Create a new project

In this labeling tool, a project is used to store your configurations and settings. Create a new project and fill in the fields with the following values:

Display Name - the project display name
Security Token - Some project settings can include sensitive values, such as API keys or other shared secrets. Each project will generate a security token that can be used to encrypt/decrypt sensitive project settings. You can find security tokens in the Application Settings by clicking the gear icon in the lower corner of the left navigation bar.
Source Connection - The Azure blob storage container connection you created in the previous step that you would like to use for this project.
Folder Path - Optional - If your source forms are located in a sub-folder on the blob container, specify the folder name here
Form Recognizer Service Uri - Your Form Recognizer endpoint URL. It should have the following format: https://<your-name>.cognitiveservices.azure.com.
API Key - Your Form Recognizer subscription key.
Description - Optional - Project description

Label your forms

When you create or open a project, the main tag editor window opens. The tag editor consists of three parts:

A preview pane that contains a scrollable list of forms from the source connection.
The main editor pane that allows you to label text by applying tags.
The tags editor pane that allows users to modify, reorder, and delete tags.

Identify text elements and tables

Click Run OCR on all files on the left pane to get the text layout information for each document. The labeling tool will draw bounding boxes around each text element and display an icon at the top left corner of each table. You can click a table's icon to display that table's identified borders.

Apply labels to text

Next, you'll create labels and apply them to the text elements that you want the model to recognize. A document may contain many key-value pairs that you would like to train a model to extract; the first step is to label the value of each key-value pair. For example, if you see the text Charge: 1002.00 in a form, you would label the value (1002.00) so that the AI model can be trained to extract that information from similar forms.

First, use the tags editor pane to create the tags (labels) you'd like to identify, e.g., "Cost".
In the main editor, click and drag to select one or multiple words from the highlighted text elements, e.g., "1002.00". Note: You cannot currently select text that spans across multiple pages.
Click on the tag you want to apply, or press the corresponding keyboard key (e.g., key '1' for the first tag). You can only apply one tag to each selected text element, and each tag can only be applied once per page.

Follow the above steps to label five of your forms, and then move on to the next step.

Specify tag type and format

You can specify a tag's type and format with the tag contextual menu. The type and format information is stored in fields.json in the source location and is used in post-processing to get better results.
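
As an illustration of how the stored type and format information might be read back, here is a small sketch. The schema of fields.json is not shown in this README, so the key names used below (fields, fieldKey, fieldType, fieldFormat) and the sample values are assumptions; inspect the file in your own container before relying on them.

```python
import json

# Hypothetical fields.json content. The key names and values are assumptions
# made for this example, not a documented schema.
SAMPLE_FIELDS_JSON = """
{
  "fields": [
    {"fieldKey": "Cost", "fieldType": "number", "fieldFormat": "not-specified"},
    {"fieldKey": "Vendor", "fieldType": "string", "fieldFormat": "no-whitespaces"}
  ]
}
"""


def field_types(fields_json: str) -> dict:
    """Map each tag name to its (type, format) pair."""
    doc = json.loads(fields_json)
    return {f["fieldKey"]: (f["fieldType"], f["fieldFormat"]) for f in doc["fields"]}
```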

Train a custom model

Click the Train icon on the left pane to open the Training page. Then click the Train button to begin training the model. Once the training process completes, you'll see the following information:

Model ID - The ID of the model that was created and trained. Each training call creates a new model with its own ID. Copy this string to a secure location; you'll need it if you want to do prediction/analysis calls through the REST API.
Average Accuracy - The model's average accuracy. You can improve model accuracy by labeling additional forms and training again to create a new model. We recommend starting by labeling five forms and adding more forms as needed.
The list of tags, and the estimated accuracy per tag.

After training finishes, examine the Average Accuracy value. If it's low, you should add more input documents and repeat the steps above. The documents you've already labeled will remain in the project index.

Tip: You can also run the training process with a REST API call. To learn how to do this, see Train with labels using Python.
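
For orientation, the sketch below builds (but does not send) such a training request using only the Python standard library. The path /formrecognizer/v2.0/custom/models and the body fields source, sourceFilter and useLabelFile are recalled from the v2.0 REST quickstart and should be treated as assumptions; check the API reference for the version you use. build_train_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_train_request(endpoint: str, api_key: str, sas_url: str,
                        folder: str = "") -> urllib.request.Request:
    """Build (but do not send) a 'train with labels' request."""
    body = {
        "source": sas_url,  # container SAS URL from the connection step
        "sourceFilter": {"prefix": folder, "includeSubFolders": False},
        "useLabelFile": True,  # train from the labels created in the tool
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/formrecognizer/v2.0/custom/models",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Building and sending are kept separate here so the request can be inspected without a live subscription.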

Analyze a form

Click on the Analyze icon on the left pane to open the Analyze page. Upload a form document that you haven't used in the training process. Then click the Analyze button on the right to get key-value predictions/analyses for the form. The tool will highlight the fields and their bounding boxes and will report the confidence of each value.

Tip: You can also run the Analyze API with a REST call. To learn how to do this, see Train with labels using Python.
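
Once an analyze result has been downloaded as JSON, the fields and their confidence values can be listed with a few lines of Python. This is a sketch under an assumed result shape (analyzeResult, then documentResults, then fields, each field carrying text and confidence); summarize_fields is a hypothetical helper, and the shape should be verified against a real result file.

```python
def summarize_fields(result: dict) -> list:
    """Return (tag name, extracted text, confidence) for every field in a result."""
    rows = []
    for doc in result.get("analyzeResult", {}).get("documentResults", []):
        for name, field in doc.get("fields", {}).items():
            if field is None:
                # Assumed convention: a tag that was not found is reported as null.
                rows.append((name, None, None))
            else:
                rows.append((name, field.get("text"), field.get("confidence")))
    return rows
```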

Compose a model

Click the Compose icon on the left pane to open the Compose page. FOTT will display the first page of your models in a list, in descending order of Model ID. Select the models you want to compose into one model and click the Compose button. Once the new model has been composed, it's ready to analyze with.

To load more of your models, click the Load next page button at the bottom of the list. This will load the next page of your models in descending order of Model ID.

You can sort the currently loaded models by clicking the column headers at the top of the list. Only the currently loaded models will be sorted. You will need to load all pages of your models first and then sort to view the complete sorted list of your models.

Save a project and resume later

To resume your project at another time or in another browser, you need to save your project's security token and reenter it later.

Get project credentials

Go to your project settings page (document setting icon) and take note of the security token name. Then go to your application settings (gear icon), which shows all of the security tokens in your current browser instance. Find your project's security token and copy its name and key value to a secure location.

Restore project credentials

When you want to resume your project, you first need to create a connection to the same blob storage container. Repeat the steps above to do this. Then, go to the application settings page (gear icon) and see if your project's security token is there. If it isn't, add a new security token and copy over your token name and key from the previous step. Then click Save Settings.

Resume a project

Finally, go to the main page (house icon) and click Open Cloud Project. Then select the blob storage connection, and select your project's .proj file. The application will load all of the project's settings because it has the security token.

Share a project

FOTT allows you to share a project with someone who also uses the tool and has access to the same Azure Blob Storage container where the project is located. To share a project, follow these steps:

On the sending side:

Open the project you want to share in the tool. In the top right corner, find and click the "share" icon. You should see a pop-up message saying that your share string has been saved to your clipboard.
Share the string in your clipboard in whatever way is convenient for you.

On the receiving side:

Go to the "Home Page", and click on "Open Cloud Project".
Paste the shared string into the appropriate field in the pop-up.
Click okay.

Keyboard Shortcuts and useful tips

Labeling tool allows a number of keyboard shortcuts to support accessibility and also sometimes make labeling easier and faster. You can view them by clicking the following icon on the right side of the title bar:

Hotkeys 1 through 0 and all letters are assigned to the first 36 tags. After you select one or multiple words from the highlighted text elements, press one of these hotkeys to label the selected words.

The '[' and ']' keys can be used to move the selection to the previous or the next word.

The '<' and '>' keys can be used to go to the previous or the next page in multi-page documents.

The '-', '+' and '/' keys can be used to zoom in/out and reset the zoom of the editing page.

Hold the Alt key and click a tag name to change the tag's name.

To delete all labels for a tag, select all labels for that tag on the document and then press the 'delete' key.
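
The hotkey assignment for the first 36 tags (1 through 0, then the letters) can be pictured as a simple lookup. The sketch below assumes the digits come first in keyboard order and the letters follow alphabetically; that ordering is an assumption, and hotkey_for_tag is a hypothetical name, not a function from the tool.

```python
import string

# Ten digit keys in keyboard order (1..9, then 0), followed by a..z: 36 keys.
HOTKEYS = "1234567890" + string.ascii_lowercase


def hotkey_for_tag(index: int):
    """Return the hotkey for the tag at a zero-based position, or None."""
    return HOTKEYS[index] if 0 <= index < len(HOTKEYS) else None
```

Under this assumption, the 37th tag and beyond have no hotkey and must be applied by clicking.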

Collaborators

This project is cloned and modified from the VoTT project.

Contributing

There are many ways to contribute to Form OCR Testing Tool -- please review our contribution guidelines.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.


ocr-form-tools's Issues

FOTT bug report - Deselect tag content doesn't work

Describe the bug
After you have set the tag for a highlighted part of text, it can not be removed easily.

To Reproduce
Steps to reproduce the behavior:

Highlight a part of the text
Apply the text to a tag
Try to remove the selected words from this tag, without reapplying it to another tag.

Expected behavior
Have a way to clear a tag, without deleting the tag. Currently I assign the selected text to another new tag, which I then remove. This should be much easier.

Desktop (please complete the following information):

OS: Mac
Docker version: latest 1.0.0 and 2.0.0-202fb2f
Browser: Edge Chromium Canary (Mac)
Version: 84.0.488.0

FOTT bug report - still has non-numeric text/space after label as numeric

When labeling, the customer specified one field as numeric.
They expected to receive the data in numeric format, meaning
no additional strings and no whitespace.
When testing on their samples, they found that in some cases the Form Recognizer service returns extra strings as well, and the digits have whitespace between them (as if they were an array of numbers).
Expectation:
the type specification should rule out this noise.
For example, they receive “UMH 00181” or “ 0 02 81”. They consider this as not honoring the type specification given at labeling time.
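
Until the service honors the numeric type, a client-side cleanup along the following lines could approximate what the reporter expects. This is only a sketch of a workaround, not the behavior of Form Recognizer, and normalize_numeric is a hypothetical helper.

```python
import re


def normalize_numeric(raw: str) -> str:
    """Drop every character that is not a digit, a sign, or a decimal point."""
    return re.sub(r"[^0-9.+-]", "", raw)
```

Applied to the two values from the report, it yields "00181" and "00281". It cannot tell a stray letter from a meaningful one, so it should only be applied to fields that are known to be numeric.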

Change selected asset before OCR running finished would throw exception: width cannot be null

Easily create sample code based on user's config

scenario
• After trying the whole end-to-end scenario, a user already provides all necessary parameters and info to finish an end-to-end operation with code. I wish there is a button of “turn this into python code”. It will produce a python code which:
○ Loads the labeled files
○ Train
○ Get the train result
○ Predict use the file path (one file or multiple files)
○ Display the prediction result
○ Reference of Python code: https://docs.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/python-labeled-data

Suggestion: Store thumbnail image in blob storage for speed

Is your feature request related to a problem? Please describe.
When dealing with lots of documents the time to see thumbnails can take a bit as you scroll in the list of documents. The document (PDF in my case) has to be loaded client side and a thumbnail generated.

Describe the solution you'd like
Store the generated thumbnail and just load it instead. This would be much faster as just a straight image.
Optionally - flag in the setup to 'Store thumbnails in blob storage'.

add a 'keyboard shortcut overlay' to show shortcuts and other tips

now: some of the shortcut keys are hard to find or remember.

expect: have a shortcut & tips page, user could easily toggle on/off.

simple improvement of the analyze sample code

code: https://github.com/microsoft/OCR-Form-Tools/blob/master/public/analyze.py

improvements:

add more comments, e.g. Project info, Copyright info, comment for each function.
add comment about the commandline parameter

optional:
3) add code to analyze all files in one local directory and save the result in another directory (with commandline parameters)

---- this could be done later ----
4) improve the algorithm in 3) with multi-threaded processing so that the code can process thousands of files.
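
Items 3) and 4) could be sketched as follows with a thread pool from the standard library. analyze_file is a placeholder for the real call in analyze.py, whose interface is not shown here; the function names, the worker count and the output naming are all assumptions made for the example.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# File types the labeling tool accepts (PDF, JPEG, TIFF).
FORM_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".tif", ".tiff"}


def analyze_file(path: Path) -> dict:
    """Placeholder for the real analyze call (hypothetical)."""
    return {"file": path.name, "status": "not analyzed"}


def analyze_directory(input_dir: str, output_dir: str, workers: int = 4,
                      analyze=analyze_file) -> int:
    """Analyze every form in input_dir and write one JSON result per file."""
    src, dst = Path(input_dir), Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in src.iterdir() if p.suffix.lower() in FORM_SUFFIXES)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with files.
        for path, result in zip(files, pool.map(analyze, files)):
            (dst / (path.name + ".json")).write_text(json.dumps(result, indent=2))
    return len(files)
```

Threads suit this job because each call mostly waits on the network. The worker count should stay within the request rate your subscription allows.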

Suggestion: Add lasso selection functionality to Form Labeling tool

copied from https://cognitive.uservoice.com/forums/921556-form-recognizer/suggestions/39850153-add-lasso-selection-functionality-to-form-labeling

It would be really nice to have a "Lasso" selection functionality in the form labeling tool. I know I can hold the left mouse button down to highlight multiple words but there are some areas of forms (e.g. remarks) where we have several hundred words that have to be selected. They usually are in a rectangle shape so hence a lasso selection would work perfectly.

support Checkbox labeling

layer
checkbox should have its own layer, only show checkbox UI when the layer is activated.

UI
have simple UI to support select the checkbox

field type support
checkbox should have its own fieldtype.

pass such value to backend for training

show result
in this UI, show training accuracy info and analyze result, and analyze result confidence info.

Prevent user from changing tag type back to text if the tag labels a checkbox, and vice versa

If a tag has labeled a checkbox, its type should stay as checkbox. According to our logic and design, we may not allow the user to change it back to text.

Only show preview warning when there is checkbox

Support URL source input for predict file upload

renaming "checkbox" to "selectionMark"

Since we're planning to introduce new feature Selection Group - "checkbox" needs to be renamed to "selectionMark".

Prototype: receipt analyze

Receipt is a pre-built model, it's a good demo scenario towards building a comprehensive demo for FR.

good UI, user could know where to get to this demo, and how to return back to do other things (home UI)
good demo quality, could we have recommended demo files or URLs so that users could see the expected successful result? we don't want to see users struggling to find a demo image from their PC and end up with less-than-ideal result.
consistent. do we also provide a "project"? what's in the project? do we need SaS token for blob storage account?
partner with receipt team, after we finish the technical work, we could sync up with receipt team and try to deliver a demo they really want to show to customers.

FOTT bug report: project creation page keeps restoring old and incorrect settings

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:
Let Chrome remember the user's form input; Chrome then keeps restoring those values for the user.

Expected behavior
Some values should not be auto-filled

FOTT bug report - need link to sample file set

need a link to the sample file set, so that user could download / upload to blob storage and later testing.

Region category is not updated while updating tag type

Suggestion: Copy fields to another project

Is your feature request related to a problem? Please describe.
When working on multiple models with the same fields, it is a hassle to create them over and over again. I have tried copying the fields.json to another project, however that file will be overwritten with an empty one directly..

Describe the solution you'd like
A method to copy fields to another model project, or at least the possibility to upload it on the blob storage.

Asset state doesn't change after running OCR on it

FOTT bug report: prediction page can't keep the result

label and train a model
predict on a file, see the result UI
click on the labeling page to see some labels
return to prediction page

result: everything is gone
expect: keep the content of last prediction run.

show a visualized analyze result page and field raw value vs. post-processed value

Is your feature request related to a problem? Please describe.
after getting the result, I couldn't see the result in the UI, just like labeling UI

Describe the solution you'd like
after I get the analyzed result, I'd like to see the fields being highlighted, also show the value extracted. It might be different from the value in the image/pdf, for example, I specified the field type as "no-whitespace", I'd like to compare the raw data vs. the analyzed result.

Describe alternatives you've considered
I have to write my own code to show this, which is re-inventing the wheel, as FOTT has such code already in labeling process.

Use higher contrast highlighting in the Form Labeling tool, need more visual indication

copied from uservoice:
Using different shades of green for words that are selected vs just highlighted from OCR does not provide enough visual indication. For those of us that are color blind it's difficult to see the subtle difference. You might want to check out Microsoft's Accessibility Insights tool.

UI improvement for "Predict" page

UI change for "Predict" page:
change the "Predict" to analyze
• Rename heading text to “Upload file and analyze” so it’s clearer what the overall user action is.
• Separate and change “Result” heading to “Results”
• Add a label for Page # / Field name / Value plus Confidence %
• Remove “P.” from page number. Add tooltip if necessary.
• Add horizontal line and move the “Download result (JSON)” button down to the bottom since it’s a secondary action.
• Add space (4px) between Analyze icon and Analyze text at the top.

also, have a button for "Generate sample code"

in our docs, we should make sure the word Predict will be changed to Analyze over time. we could keep Predict in some places for backward compat.

Tag items sometimes would flash to wrong place when moving it up and down

Fix the bug where, when moving tag items up and down, the moving item sometimes flashes back and forth between past positions.

bug: break in sidebar

Describe the bug
On the 'create new project' page, when viewed on non-Hi-Res screens (1980p and below) or when the user has zoomed in or scaled up in browser settings on a Hi-Res screen, the sidebar does not extend all the way to the bottom of the screen. There’s a gap between the sidebar and the status bar (see screenshots).

To Reproduce
Steps to reproduce the behavior:

Go to 'initial screen '
Click on 'New Project '
Increase the scale or zoom-in little bit, then scroll down
See the error on the left.

Expected behavior
No breaks in sidebar.

Screenshots

Desktop (please complete the following information):

OS: any
Docker version: not tested
Browser: any
Version: 2.0.0-bce554e

Additional context

Ease of Use - links to other helpful info

more links

• We need some accessory features like sending feedback, link to FAQ, etc.

feedback link: https://github.com/microsoft/OCR-Form-Tools/issues/new/choose
FAQ: https://stackoverflow.com/tags/form-recognizer

• Have a "what's new" link/page to show the announcements we push to users.

What's new: for each month's release. e.g. for April release, we should have a simple page describing the new features - Checkbox support.

Support checkbox group

Elements can still be added to tag which has different type of elements

Wish: make each tag contain only one type of element, either text or checkbox, since that is more intuitive according to our logic.

Keep downloaded prediction JSON result the same as API result.

Support model compose

FOTT bug report: need better error message for CORS

We need to configure cross-domain resource sharing (CORS) before we can use FOTT, but several users are stuck because CORS is not enabled on the blob storage. The error message is not clear. It says:

Failed to send request to http://...

expect:
Any chance we can make the error more informative and tell the user that CORS is not enabled, so that they can resolve it on their own without needing support?

support Model Compose feature

This feature needs to align with model compose API release date.

basic feature:

list models in a subscription
view model basic info
could select several model and compose a new model
given a doc (PDF or image), could run prediction on a selected model. (this is the same as today)

Show table result in prediction page.

Performance: Use listBlobHierarchySegment to list file in blob folder to increase performance

FOTT bug report: Prediction page does throw TypeError `getResolutionForZoom`

Describe the bug
Predict page throws TypeError, Cannot read property of 'getResolutionForZoom' of null.

To Reproduce
I am having a hard time reproducing it now when I create this issue. However, it did happen a few times today already.

Steps to reproduce the behavior:

Go to predict page
Upload file and press 'predict'
???

It has something to do with zooming in and resizing the screen. I will update this issue when I have more details.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots

Desktop (please complete the following information):

OS: Linux, Azure WebApps for Linux
Docker version: unknown
Browser: Edge Chromium Canary
Version: 82.0.459.1

Saving/creating a project in project settings takes time to save and doesn't show that it's saving

Select tagged element immediately after renaming the tag

handle Long tag names well when it exceeds the visible area.

copied from uservoice:
If you use long tag names it exceeds the visible area. It would be nice to be able to expand the tag name area via a splitbar.

easy re-use of field names when labeling for similar models.

user wish: I am creating 10 invoice models for 10 providers. One needs to input the labels for each project, 10 times over, and make sure they are labeled with the same names.

“Labels are ‘common’. I want the same labels available for every model. We could re-type them in for each one – but they all have the same types of labels – and if it does not have that label on that document type we just don’t label it. Just want to make sure one person does not label it as ‘ExpirationDate’ and then on another model it is called ‘Expiration’. I need those consistent when I get the data back from the models to be able to process it!”

wish: in near future:
same user could re-use a set of commonly used field names.
could import a file which contains a list of field names.

wish: future: (need more analysis)
share such common field names between different users.

Suggestion: Add document management

Is your feature request related to a problem? Please describe.
Add ability to add or remove a document from the list. As it is now, you have to go directly to blob storage and upload/delete from there instead of this nice UI.

Describe the solution you'd like
A button to add a document and a way to delete an existing doc. (This would delete the ocr/label data that goes with it)

Describe alternatives you've considered
Separately uploading and deleting through another UI seems like the only alternative.

auto suggest field type when labeling

right now, I need to label a field and then pick the field type from the dropdown list, for every single field.

expect:
since I have clicked on the field value, it's relatively easy to infer the field type from the value. Could the tool automatically set the field type for me?

for ambiguous field types, the tool could pick one of them

once user makes a manual selection, respect that selection.
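
One way the requested inference could work is a short chain of pattern checks on the selected text. The sketch below is illustrative only: the type names (integer, number, date, time, string) and the patterns are assumptions, ambiguous values simply take the first matching type, and a manual selection always wins.

```python
import re


def suggest_field_type(text: str, manual: str = None) -> str:
    """Guess a field type from a labeled value; a manual choice is respected."""
    if manual:
        return manual
    value = text.strip()
    if re.fullmatch(r"[+-]?\d+", value):
        return "integer"
    # Either grouped thousands with an optional fraction, or a plain decimal.
    if re.fullmatch(r"[+-]?\d{1,3}(,\d{3})*(\.\d+)?|[+-]?\d*\.\d+", value):
        return "number"
    if re.fullmatch(r"\d{1,4}[-/.]\d{1,2}[-/.]\d{1,4}", value):
        return "date"
    if re.fullmatch(r"\d{1,2}:\d{2}(:\d{2})?(\s?[AaPp][Mm])?", value):
        return "time"
    return "string"
```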

Suggestion: Perform OCR on every document in list ahead of time

Is your feature request related to a problem? Please describe.
As I click on each document that has had labels applied - it must perform OCR on it and I must wait for it to finish. Instead of that - a button (or just do it) to go ahead and perform OCR on all documents that need it in the background would be nice.

Describe the solution you'd like
An option in the setup to 'Perform OCR on all documents without OCR data automatically in background'. If that is on, it does it - if it is off - it works like it does today.

Optionally - you could have a button/option that needs to be pressed that would do the same action.

Describe alternatives you've considered
Waiting for OCR as you click each document is a bit painful when it could easily have been done while I was labeling. The only alternative I know would be to write some scripts to do this myself, which seems a bit much considering this UI already does most of it.

Suggestion: Process Folders in blob storage

Is your feature request related to a problem? Please describe.
The number of files in a model can be quite large when handling many document types - like invoices, etc. Having to maintain all the files in one big list with unique file names, etc is difficult.
For each company we deal with - they send around 7 different types of documents. For each type of document we need 5 samples. So with only 10 companies we are already managing 350 files!

Describe the solution you'd like
Use files in folders for labeling and model building.
At a minimum, that would mean displaying them just like any other root folder item.

Optionally, the UI could represent folders on the thumbnail view and only show files from that one folder (or show all).
Optionally, the setup could have a comma separated list of folders to include or exclude, and or take multiple glob expressions.
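
The include/exclude idea with glob expressions could look roughly like this. select_blobs is a hypothetical helper; it uses fnmatchcase from the Python standard library, where '*' also matches '/', so a pattern such as */invoice/* spans folder levels.

```python
from fnmatch import fnmatchcase


def select_blobs(names, include=("*",), exclude=()):
    """Keep blob names that match any include glob and no exclude glob."""
    return [
        name
        for name in names
        if any(fnmatchcase(name, pattern) for pattern in include)
        and not any(fnmatchcase(name, pattern) for pattern in exclude)
    ]
```

With the defaults every blob is kept, which matches the behavior when no folder list is configured.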

Describe alternatives you've considered
Without this, we end up with 1000s of files to manage in one big folder.

put version / build information and contact information in the "setting" page

wish:

major version: tie to major API version (2.0)
minor version: the date of build 2020.04.08
build: we need to append the raw build commit
source code: github project URL
contact info: aka.ms/formrecognizer

Predict multiple files in one directory, or Azure blob storage

the current tool could only run the prediction on one file, could it run prediction for multiple files?

for example, user could specify a blob storage, or a local file path, the tool will then load all files and run them thru prediction call one by one.
At the end, the tool could share the overall accuracy of process multiple files, also the individual result, if needed.

FOTT bug report - Reordering a tag quickly does not work

Describe the bug
In the Tags editor page, I have around 20+ tags and want to reorder one that is at the bottom of the tag list for labelling efficiency.
When I click on the arrow to reorder it several times in a row the tag doesn't go at all where expected.
It renders the reordering of labels for projects with a large number of labels practically impossible.

To Reproduce
Steps to reproduce the behavior:

Go to Tags Editor page
Create around 10 tags
Select a tag at the bottom of the list and try to move it up to first place
See that the tag moves up and down on its own for a few seconds after last clicking on the Move Tag up arrow

Expected behavior
Reordering of tags working smoothly

Desktop (please complete the following information):

OS: macOS Catalina 10.15.4
Docker version: Docker Desktop Community 2.2.0.5
Browser: firefox
Version : latest docker image (ImageId: 23ba43da7425)

Additional context
Azure Storage is in France Central, Form Recognizer in West Europe

have version number linking to the change log file

make the version number in the bottom-right corner a clickable link,
pointing to https://github.com/microsoft/OCR-Form-Tools/blob/master/CHANGELOG.md

Lost Highlight of label while browsing on it

Bug reported by Nini

External Project Management Support.

Is your feature request related to a problem? Please describe.
I need a way to set up a project programmatically so an end user only has to tag images. I also need this solution to scale for thousands of projects.

Describe the solution you'd like
A page that could take a project security token, connection, and project as parameters and drop the user into the tagging UI would be best. A way to create the projects through an api in azure would also be helpful.

Describe alternatives you've considered
My current plan is to give my end users the keys to create their own projects and have them upload images through a custom UI I've built. After they've created a model I'll have them paste the model ID back into my application.

Additional context
I need my end users to manage the creation of around 11,000+ models to extract data from different formats.

regularly check - "what's new"

we have weekly update of stable builds, and sometimes daily update to fix issues.
we want users to know we have new features and important fixes.

it seems most people would just re-use the same webpage instance, it would be nice for the web page to check and show "what's new" in the appropriate places.

this also enables a simple heart-beat count so that server could know how many instances are out there without any user info.
