
siga's Introduction

siga's People

Contributors

tongkunguan


siga's Issues

Error loading the model during test

Hello, when I load the model for testing, the following error is displayed:

Traceback (most recent call last):
  File "/content/drive/MyDrive/SRresaerch/SIGA/SIGA_R/test.py", line 223, in <module>
    test(opt)
  File "/content/drive/MyDrive/SRresaerch/SIGA/SIGA_R/test.py", line 127, in test
    model.load_state_dict(pretrained_state_dict['net'])
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DataParallel:
    Unexpected key(s) in state_dict: "module.model_one.Transformation.GridGenerator.inv_delta_C", "module.model_one.Transformation.GridGenerator.P_hat".

What could be causing this problem?
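
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: the checkpoint contains inv_delta_C and P_hat entries that the model built by test.py does not register, for example because the checkpoint and the model were created with different versions of the code. PyTorch's load_state_dict accepts strict=False, which skips unexpected keys. The sketch below shows the equivalent filtering with plain dictionaries so that the dropped keys stay visible. The helper name filter_state_dict and the key "module.model_one.Prediction.weight" are hypothetical and used only for illustration.

```python
def filter_state_dict(checkpoint_state, model_keys):
    """Split a checkpoint into entries the model expects and entries it does not."""
    expected = set(model_keys)
    kept = {k: v for k, v in checkpoint_state.items() if k in expected}
    dropped = sorted(k for k in checkpoint_state if k not in expected)
    return kept, dropped


# Toy stand-ins for model.state_dict().keys() and pretrained_state_dict['net'].
# "module.model_one.Prediction.weight" is a made-up key for illustration.
model_keys = ["module.model_one.Prediction.weight"]
checkpoint_state = {
    "module.model_one.Prediction.weight": [0.1, 0.2],
    "module.model_one.Transformation.GridGenerator.inv_delta_C": [1.0],
    "module.model_one.Transformation.GridGenerator.P_hat": [2.0],
}

kept, dropped = filter_state_dict(checkpoint_state, model_keys)
print("dropped:", dropped)

# With a real model (assumption: the model recomputes the dropped tensors itself):
#   kept, dropped = filter_state_dict(pretrained_state_dict['net'],
#                                     model.state_dict().keys())
#   model.load_state_dict(kept)
```

Printing the dropped keys before loading makes it easier to check that only the two GridGenerator entries are skipped and that no trained weights are silently discarded.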

Generating masks takes a long time

Hello,
After reading your paper, I would like to apply the segmentation method to my own work on scene text recognition. However, generating the image masks during training takes a long time and increases the overall training time. I first considered generating the masks in advance on a local machine, but the synthetic training dataset has 15M images, so generating all the masks would also take many days. How did you deal with this problem during training?
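
One common way to avoid recomputing masks in every epoch, sketched here under the assumption that the mask for a given image never changes, is to compute each mask on first use and cache it on disk. This is not taken from the SIGA code: cached_mask, slow_mask, and the cache layout are hypothetical names chosen for illustration.

```python
import hashlib
import os
import pickle
import tempfile


def cached_mask(image_path, compute_mask, cache_dir):
    """Return the mask for image_path, computing it only on the first request."""
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha1(image_path.encode("utf-8")).hexdigest()
    cache_file = os.path.join(cache_dir, key + ".pkl")
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    mask = compute_mask(image_path)
    with open(cache_file, "wb") as f:
        pickle.dump(mask, f)
    return mask


calls = []


def slow_mask(path):
    # Placeholder for an expensive step such as k-means clustering.
    calls.append(path)
    return [[0, 1], [1, 0]]


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_mask("img_0001.jpg", slow_mask, cache_dir)
    second = cached_mask("img_0001.jpg", slow_mask, cache_dir)

print(len(calls))  # slow_mask ran once; the second call read the cache
```

With this pattern the cost is paid once per image during the first epoch. If several data-loading workers share the cache, writing to a temporary file and renaming it would avoid reading partially written files.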

Code of the Transformer architecture

Have you released the code of the Transformer architecture? Please forgive my ignorance, but I can't seem to find it.
Additionally, I could not find the Glyph Pseudo-label Construction (GPC), Glyph Attention Network (GLAN), and Attention-based Character Fusion Module (ACFM) when I searched the code for their abbreviations. I guess they are implemented across a couple of files under the modules folder. Could you provide more information about them? Where is each one located in the code, and how can I find and use these three modules for ablation studies? Again, forgive my ignorance; I'm really a budding nerd. Thank you so much.

Regarding Text Mask Generation

Hello, thanks for your work. I thoroughly enjoyed reading the paper. I have a couple of questions regarding text mask generation.

  1. When training the segmentation network on the labels generated with k-means, did you employ image augmentations such as random transformations and color jittering? I have faced challenges with k-means on images that have color jittering.
  2. I have also observed that, after k-means, the text pixels are assigned to cluster 0 for some images and to cluster 1 for others, depending on the color of the text. Could this lead to problems when training the segmentation model?
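
On the second question, k-means cluster indices are arbitrary, so a fixed rule is needed to decide which cluster is text. The sketch below runs a two-cluster k-means on grayscale intensities and then treats the cluster that covers most of the image border as background. The border rule is an assumption chosen for illustration, not a method confirmed by the paper, and two_means and text_mask are hypothetical names. Plain lists are used so the example has no dependencies; a real pipeline would more likely use NumPy or scikit-learn.

```python
def two_means(values, iterations=20):
    """1-D k-means with k=2; returns a 0/1 cluster index per value."""
    c0, c1 = float(min(values)), float(max(values))
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        g0 = [v for v, l in zip(values, labels) if l == 0]
        g1 = [v for v, l in zip(values, labels) if l == 1]
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return labels


def text_mask(gray):
    """Return a mask where 1 marks text, regardless of the text colour."""
    h, w = len(gray), len(gray[0])
    flat = [p for row in gray for p in row]
    labels = two_means(flat)
    grid = [labels[r * w:(r + 1) * w] for r in range(h)]
    border = [grid[r][c] for r in range(h) for c in range(w)
              if r in (0, h - 1) or c in (0, w - 1)]
    # Assumption: the cluster covering most of the border is background.
    background = 1 if sum(border) * 2 > len(border) else 0
    return [[0 if v == background else 1 for v in row] for row in grid]


dark_on_light = [
    [250, 250, 250, 250],
    [250, 10, 20, 250],
    [250, 250, 250, 250],
]
light_on_dark = [[255 - p for p in row] for row in dark_on_light]

mask_a = text_mask(dark_on_light)
mask_b = text_mask(light_on_dark)
print(mask_a == mask_b)  # both polarities give the same text mask
```

Making the label assignment consistent before training means the segmentation target does not flip between images with dark text and images with light text.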

Dataset link not found

The first dataset link seems to be inaccessible. Can it be fixed?
Also, I would like to ask about Tables 2 and 3, where some datasets in the first row are annotated with two numbers, such as "IC13-857" and "IC13-1015". Does the number represent the number of samples in the test set?
