j-rausch / dsg
License: MIT License
Is there a setting I can pass when running demo.py so that the hOCR output includes the div for figures contained in the image, or is that supposed to happen automatically? I'm testing it on various images, and the figures are currently not being detected. However, I noticed that the div is present in sysdemo.
Half the repos in the installation process are not found. PyTorch 1.11 is no longer served by pip, and neither is the specified NumPy version.
Is there a simpler installation process to your knowledge? I would really like to try your code but the wall of installs is preventing me from going forward.
Thank you for publishing this great work.
I successfully ran a demo using the Eperiodica dataset.
However, I couldn't find the arXiv dataset in your Google Drive link.
Is it possible to publish the arXiv dataset?
Thank you in advance.
Hi, Thank you for the great work.
I see that the repo only has prediction/demo code available. Could you please add the fine-tuning code and describe the custom dataset format that the model accepts? Could you also add all the details required to run custom training on the datasets?
Thank you
Hi, how can I get the text of an image through OCR? Is there already something implemented, or do I need to do it outside of this project?
Hi, and thank you for this great project.
I am trying to test on my own image, and I don't know how to generate the accompanying .txt file for the image.
I see there is one in the sysdemo folder.
How do I generate this file?
Thank you.
Thanks for your great project! I would like to use the datasets you've provided for my research. However, I'm having difficulty interpreting the rules for relationship annotation in the provided files. Could you please clarify the annotation guidelines?
Why can a figure have so many parents? Also why can a row be the parent of so many entities?
Hi, thank you for this great project!
I'm trying to test my own images and I generated an OCR file using Tesseract, but its format is different from the OCR file in sysdemo.
What do the two numbers at the beginning of the OCR file in sysdemo mean? And are the four numbers following each word the coordinates?
I look forward to your reply.
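For anyone attempting the same conversion, here is a minimal sketch of extracting word-level bounding boxes from Tesseract via `pytesseract.image_to_data`. It only produces `(text, x0, y0, x1, y1)` tuples. The layout of the sysdemo OCR file is not documented in this thread, so writing the tuples out in that format is deliberately left open. The helper names `words_from_tesseract_dict` and `ocr_words` are hypothetical and not part of this project.

```python
from typing import Dict, List, Tuple

# One recognized word: text, then left, top, right, bottom in pixels.
WordBox = Tuple[str, int, int, int, int]


def words_from_tesseract_dict(data: Dict[str, list]) -> List[WordBox]:
    """Convert a pytesseract image_to_data dict into a list of word boxes.

    Entries with empty text (page, block, paragraph, and line rows) are skipped.
    """
    boxes: List[WordBox] = []
    for text, left, top, width, height in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        word = str(text).strip()
        if not word:
            continue
        x0, y0 = int(left), int(top)
        boxes.append((word, x0, y0, x0 + int(width), y0 + int(height)))
    return boxes


def ocr_words(image_path: str) -> List[WordBox]:
    """Run Tesseract on an image file and return its word boxes."""
    # Imported lazily so the conversion helper above works without Tesseract.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return words_from_tesseract_dict(data)
```

Calling `ocr_words("page.png")` (a placeholder path) requires both the `pytesseract` package and a local Tesseract installation. Mapping the resulting tuples onto the sysdemo file still depends on the maintainers' answer about what the leading two numbers and the per-word numbers mean.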
Hi Johannes, thanks for your great project! I would like to build on your datasets and work for my research. However, I haven't found the evaluation scripts needed to reproduce the results in the DSG paper.
I tried to modify the training scripts like:
CUDA_VISIBLE_DEVICES=0 python scripts/train_doc_SG_head.py \
    --config-file ./configs/sgg_end2end_EP.yaml \
    --num-gpus 1 \
    --eval-only \
    --resume \
    MODEL.ROI_SCENEGRAPH_HEAD.PREDICT_USE_VISION True \
    OUTPUT_DIR ./output/eval \
    MODEL.WEIGHTS ./checkpoints/DSG_E2E_eperiodica/dsg_e2e_eperiodica_checkpoint.pth
However, it raised an error:
TypeError: _evaluate_predictions_on_coco() got an unexpected keyword argument 'use_fast_impl'
I do not know what the correct command is.
Thanks! Look forward to your reply!
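A `TypeError` of this form means the caller passed a keyword that the called function's definition does not declare, which can happen when the two pieces of code come from different versions. The following is a minimal diagnostic sketch, not a confirmed fix for this repository: it inspects a function's signature and drops keywords it does not accept. `evaluate_stub` is a hypothetical stand-in for the real evaluation helper, and `supported_kwargs` is an illustrative name.

```python
import inspect
from typing import Any, Callable, Dict


def supported_kwargs(func: Callable[..., Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the keyword arguments that func's signature can accept."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-in for an evaluation helper without `use_fast_impl`.
def evaluate_stub(coco_gt, coco_results, iou_type):
    return f"evaluated {iou_type}"


call_kwargs = {"iou_type": "bbox", "use_fast_impl": True}

# Show which parameters the installed function really declares.
print(sorted(inspect.signature(evaluate_stub).parameters))

# Call it with the unsupported keyword filtered out.
print(evaluate_stub(None, [], **supported_kwargs(evaluate_stub, call_kwargs)))
```

Running `inspect.signature` on the actual `_evaluate_predictions_on_coco` in the environment shows whether `use_fast_impl` is declared there, which helps tell a version mismatch apart from a wrong command.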