herobd / fudge

Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing"

License: GNU General Public License v3.0

Python 100.00%

fudge's Issues

Evaluation config

Hello.

I am trying to run an evaluation with eval.py for benchmarking.
The problem is that I am not sure how to configure the metrics to get an evaluation score.
Every pretrained and example config seems to have "metrics": [], and eval.py logs nothing when run.

How do I configure the metrics, and what metric options are available?
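
To confirm which configs actually leave the metric list empty, a small helper like the one below can scan a directory of config files. This is only a sketch: it assumes the configs are JSON files with a top-level "metrics" key (as the `"metrics": []` snippet suggests), and the function name `report_metrics` and the directory argument are hypothetical, not part of the repository.

```python
import json
from pathlib import Path


def report_metrics(config_dir):
    """Map each JSON config file name to the value of its top-level "metrics" key.

    Assumption: configs are plain JSON files; a missing key is reported as None.
    """
    report = {}
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open() as f:
            config = json.load(f)
        report[path.name] = config.get("metrics")
    return report
```

Calling `report_metrics("path/to/configs")` returns a dictionary such as `{"some_config.json": []}`, which makes it easy to see whether any config defines metrics at all.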

Thanks. Appreciate any help.

Some Doubts

Hi @herobd

Thanks for open-sourcing the code. The paper was a wonderful read and showcases great work.
I have a few questions whose answers would help me try this approach in a better-informed way.

1. Can you specify the hardware (GPU and memory) you used for training, so that I can estimate the training time on my setup?
2. As I understand it, each node is classified into a semantic class. Is it safe to say that this is node classification?
3. If yes, should results also be compared against SROIE key-value pair classification?
4. For word-level FUDGE, what would you suggest if I want to bring in a custom dataset?
5. If I also want to do graph classification, would you recommend simply concatenating the node features and passing them through FC layers, or somehow including edge information too?
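
For the graph classification question, one common alternative to raw concatenation is a pooled readout. Concatenating node features gives a vector whose length and ordering depend on the number of nodes, whereas averaging node and edge features separately gives a fixed-length, order-independent vector that FC layers can consume. The sketch below illustrates that idea in plain Python. It is not taken from the FUDGE code, and the names `mean_pool` and `graph_readout` are hypothetical.

```python
from statistics import fmean


def mean_pool(rows, dim):
    """Average equal-length feature vectors column-wise; zeros if there are none."""
    if not rows:
        return [0.0] * dim
    return [fmean(col) for col in zip(*rows)]


def graph_readout(node_feats, edge_feats, node_dim, edge_dim):
    """Build a fixed-length graph vector: [mean node features | mean edge features]."""
    return mean_pool(node_feats, node_dim) + mean_pool(edge_feats, edge_dim)


# Toy graph: three nodes with 2-d features, two edges with 1-d features.
nodes = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
edges = [[0.5], [1.5]]
vec = graph_readout(nodes, edges, node_dim=2, edge_dim=1)
print(vec)  # [3.0, 4.0, 1.0]
```

The resulting vector has length `node_dim + edge_dim` regardless of graph size, so edge information is included without tying the classifier to a fixed number of nodes.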

Thanks in advance !

Change of backbone

Hello @herobd
Thanks for your awesome paper; it is a new approach to this kind of problem! I'm curious about the backbone you used in this work. YOLOv2 seems outdated nowadays, so what are your thoughts on adopting the YOLOv5 architecture for this task? Is it a good idea or not? I would like to reproduce your idea with a newer backbone and wanted to hear what you think.

Thanks in advance !

Linking for distant words

Hey @herobd,
I was trying to extract the relations for keywords using the pretrained weights and your code.

But for distant boxes it doesn't seem to identify the links. Is there any way to solve this?

Steps to evaluate

I have tried to install the library and run eval.py, but it does not proceed to the evaluation.

!python eval.py -c /content/FUDGE/saved/FUNSDLines_detect_augR_staggerLighter.pth -n 10 -d /content/

logs

loaded FUNSDLines_detect_augR_staggerLighter iteration 1
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  cpuset_checked))
Trainable parameters: 2364154
YoloBoxDetector(
  (_hack_down): Sequential(
    (0): Sequential(
      (0): Conv2d(1, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
      (1): GroupNorm(8, 32, eps=1e-05, affine=True)
      (2): Dropout2d(p=0.1, inplace=True)
      (3): ReLU(inplace=True)
    )
    (1): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (2): GroupNorm(8, 64, eps=1e-05, affine=True)
      (3): Dropout2d(p=0.1, inplace=True)
      (4): ReLU(inplace=True)
      (5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (6): GroupNorm(8, 64, eps=1e-05, affine=True)
      (7): Dropout2d(p=0.1, inplace=True)
      (8): ReLU(inplace=True)
    )
    (2): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(64, 128, kernel_size=(1, 3), stride=(1, 1), padding=(0, 2), dilation=(2, 2))
      (2): GroupNorm(8, 128, eps=1e-05, affine=True)
      (3): Dropout2d(p=0.1, inplace=True)
      (4): ReLU(inplace=True)
      (5): Conv2d(128, 128, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
      (6): GroupNorm(8, 128, eps=1e-05, affine=True)
      (7): Dropout2d(p=0.1, inplace=True)
      (8): ReLU(inplace=True)
      (9): Conv2d(128, 128, kernel_size=(1, 3), stride=(1, 1), padding=(0, 4), dilation=(4, 4))
      (10): GroupNorm(8, 128, eps=1e-05, affine=True)
      (11): Dropout2d(p=0.1, inplace=True)
      (12): ReLU(inplace=True)
      (13): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (14): GroupNorm(8, 128, eps=1e-05, affine=True)
      (15): Dropout2d(p=0.1, inplace=True)
      (16): ReLU(inplace=True)
    )
    (3): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(128, 128, kernel_size=(1, 3), stride=(1, 1), padding=(0, 4), dilation=(4, 4))
      (2): GroupNorm(8, 128, eps=1e-05, affine=True)
      (3): Dropout2d(p=0.1, inplace=True)
      (4): ReLU(inplace=True)
      (5): Conv2d(128, 128, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0))
      (6): GroupNorm(8, 128, eps=1e-05, affine=True)
      (7): Dropout2d(p=0.1, inplace=True)
      (8): ReLU(inplace=True)
      (9): Conv2d(128, 128, kernel_size=(1, 3), stride=(1, 1), padding=(0, 8), dilation=(8, 8))
      (10): GroupNorm(8, 128, eps=1e-05, affine=True)
      (11): Dropout2d(p=0.1, inplace=True)
      (12): ReLU(inplace=True)
      (13): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (14): GroupNorm(8, 128, eps=1e-05, affine=True)
      (15): Dropout2d(p=0.1, inplace=True)
      (16): ReLU(inplace=True)
    )
    (4): Sequential(
      (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (1): Conv2d(128, 256, kernel_size=(1, 3), stride=(1, 1), padding=(0, 8), dilation=(8, 8))
      (2): GroupNorm(8, 256, eps=1e-05, affine=True)
      (3): Dropout2d(p=0.1, inplace=True)
      (4): ReLU(inplace=True)
      (5): Conv2d(256, 256, kernel_size=(3, 1), stride=(1, 1), padding=(2, 0), dilation=(2, 2))
      (6): GroupNorm(8, 256, eps=1e-05, affine=True)
      (7): Dropout2d(p=0.1, inplace=True)
      (8): ReLU(inplace=True)
      (9): Conv2d(256, 256, kernel_size=(1, 3), stride=(1, 1), padding=(0, 16), dilation=(16, 16))
      (10): GroupNorm(8, 256, eps=1e-05, affine=True)
      (11): Dropout2d(p=0.1, inplace=True)
      (12): ReLU(inplace=True)
      (13): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (14): GroupNorm(8, 256, eps=1e-05, affine=True)
      (15): Dropout2d(p=0.1, inplace=True)
      (16): ReLU(inplace=True)
      (17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (18): GroupNorm(8, 256, eps=1e-05, affine=True)
      (19): Dropout2d(p=0.1, inplace=True)
      (20): ReLU(inplace=True)
    )
    (5): Conv2d(256, 250, kernel_size=(1, 1), stride=(1, 1))
  )
)
valid metrics
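
As a sanity check on the log above, the reported "Trainable parameters: 2364154" can be reproduced by hand from the printed layers. The sketch below assumes each Conv2d has a bias term (the printout does not show `bias=False`) and each GroupNorm has a learnable weight and bias (`affine=True`). Padding and dilation do not affect the count. The helper names are illustrative only.

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights (in * out * kh * kw) plus one bias per output channel."""
    kh, kw = kernel
    return in_ch * out_ch * kh * kw + out_ch


def groupnorm_params(channels):
    """Affine GroupNorm: one scale and one shift per channel."""
    return 2 * channels


# (in_channels, out_channels, kernel_size) for every Conv2d, in printed order.
convs = [
    (1, 32, (5, 5)),
    (32, 64, (3, 3)), (64, 64, (3, 3)),
    (64, 128, (1, 3)), (128, 128, (3, 1)), (128, 128, (1, 3)), (128, 128, (3, 3)),
    (128, 128, (1, 3)), (128, 128, (3, 1)), (128, 128, (1, 3)), (128, 128, (3, 3)),
    (128, 256, (1, 3)), (256, 256, (3, 1)), (256, 256, (1, 3)),
    (256, 256, (3, 3)), (256, 256, (3, 3)),
    (256, 250, (1, 1)),
]

conv_total = sum(conv2d_params(i, o, k) for i, o, k in convs)
# Every conv except the final 1x1 layer is followed by a GroupNorm on its output.
norm_total = sum(groupnorm_params(o) for _, o, _ in convs[:-1])
total = conv_total + norm_total
print(total)  # 2364154
```

The match indicates the checkpoint loaded the full detector shown in the printout, so the missing evaluation output is unlikely to be caused by a partially constructed model.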

Can you please provide the full steps for using the model and running the evaluation? If possible, please also describe how to test on a single image.
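
The DataLoader warning in the log above notes that 4 worker processes were requested while the system suggests at most 2, and that this can cause slowness or a freeze. A minimal sketch of capping the worker count to the CPUs actually available follows. The helper names are hypothetical, and how the resulting value is passed to the project's data loader is not shown here because it depends on the repository's configuration.

```python
import os


def usable_cpu_count():
    """CPUs this process may run on; falls back to os.cpu_count() where
    os.sched_getaffinity is unavailable (for example on macOS or Windows)."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def cap_num_workers(requested, available=None):
    """Clamp a requested DataLoader worker count to the available CPUs."""
    if available is None:
        available = usable_cpu_count()
    return max(0, min(requested, available))


print(cap_num_workers(4, available=2))  # 2
```

A value of 0 is also valid for PyTorch's `DataLoader(num_workers=...)` and loads data in the main process, which is a simple way to rule out worker-related hangs.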
