kiss's People

Contributors

bartzi, dependabot[bot]

kiss's Issues

[Help] Using Pretrained Model

Thanks for sharing the model. I just want to test the pretrained model that you provided. Do I still need to download the image data (SynthText/MjSynth) if I'm using the pretrained model? If not, how can I run the pretrained model on test datasets such as CUTE80, ICDAR, etc.? I have already downloaded the datasets (cute80, icdar2013, icdar2015, iiit5k, svt, svtp) and their respective npz files. How can I run the evaluation on these datasets?

Change the num_words_per_image without training again

Can we predict multiple words from a single image by changing num_words_per_image?
I tried changing it in recognizer_class in the evaluate.py file, but I am getting this error:

InvalidType: 
Invalid operation is performed in: Reshape (Forward)

Expect: prod(x.shape) % known_size(=3072) == 0
Actual: 1536 != 0

Also, why are there no spaces in char_map? (Adding them might make it possible to predict multiple words in an image.)
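
The error above is a divisibility check on a reshape with one inferred dimension. A minimal sketch of that check in plain Python follows. The shapes are hypothetical and chosen only to reproduce the numbers in the message (3072 and 1536); the source does not state the actual tensor shapes, and can_infer_reshape is an illustrative helper, not a Chainer function.

```python
import math


def can_infer_reshape(input_shape, target_shape):
    """Return True if input_shape can be reshaped to target_shape,
    where target_shape may contain a single -1 (inferred) dimension."""
    total = math.prod(input_shape)
    known_size = math.prod(d for d in target_shape if d != -1)
    if -1 in target_shape:
        # The inferred dimension must come out as a whole number.
        return total % known_size == 0
    return total == known_size


# Hypothetical shapes: 1536 elements cannot be split into rows of 3072.
print(can_infer_reshape((1, 1536), (-1, 3072)))  # False
print(can_infer_reshape((2, 1536), (-1, 3072)))  # True
```

One possible reading, which is an assumption, is that doubling num_words_per_image doubles the known size while the pretrained network still produces features for the original word count.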

mjsynth.npz only has first letter of each word in "text"

I would like to start by saying I really enjoyed reading your paper, and I am currently porting it to PyTorch.

I was going through the steps to download the synth data (outlined on your GitHub), and during the filter_word_length step I noticed that the "text" array in mjsynth.npz only contains the first letter of each word in the dataset. Is there a reason for this? Thank you.
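
One way to narrow this down is to look at the dtype of the stored array. The sketch below builds a small stand-in archive so it runs without the real mjsynth.npz. Only the key name "text" comes from the report; the fixed-width `<U1` dtype is a guess at one possible cause (NumPy silently truncates strings cast to a fixed-width string dtype) and is not confirmed by the source.

```python
import os
import tempfile

import numpy as np

# Build a stand-in archive; the real mjsynth.npz is not needed to run this.
words = ["hello", "world"]
truncated = np.array(words, dtype="<U1")  # fixed width of one character

path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(path, text=truncated)

with np.load(path) as data:
    text = data["text"]
    print(text.dtype)     # a one-character unicode dtype
    print(text.tolist())  # ['h', 'w']
```

If the real file reports a one-character string dtype for "text", the truncation happened when the array was created, not when it was loaded.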

The data link is broken

We can't find the original data, so we don't know its original layout or filenames. Therefore, we don't know how to rename the files or how to arrange the paths.

Windows fatal exception: Access violation

Hello,

When I try to train the network with train_text_recognition.py using my own images,
I get a Windows fatal exception: access violation.
This is followed by several threads, mostly coming from chainer, tensorboard and multi_node_mean.
Do you have any idea where that could come from?

Thank you in advance,
Clément

Link to download the SynthAdd dataset?

I cannot register a Baidu account, so I am not able to download this dataset.
Could anyone who has downloaded it send me another download link?
Thanks

How do you generate the mask in the transformer model and convert text labels to "class_id"?

As the project is really huge, I don't understand how you process the text labels. Usually, in an attention-based text recognizer, there are "[GO]" and "[EOS]" labels, which are converted into a "num class_id". But I don't understand how the code in the transformer model converts the labels into a "num class_id" and generates the mask as below:

self.mask = subsequent_mask(self.transformer_size)

Your code also generates the mask somewhat differently from
https://github.com/jadore801120/attention-is-all-you-need-pytorch/blob/76762bb08225014fb3055a9d07f0043aba972d68/transformer/Models.py#L169

Do you use "pad_idx", and if so, where can I find it? What is the difference between using "pad_idx" and not using it? I'm really confused by "pad_idx", "GO_idx" and "EOS_idx"; how do you process that part?
I don't quite know how to handle it. Could you give me some advice?
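
For reference, a subsequent (causal) mask in the general transformer sense can be sketched in NumPy as below. This is the common construction and not necessarily what kiss does. The pad_mask helper, the pad_idx value of 0 and the example class ids are illustrative assumptions that borrow the naming from the question.

```python
import numpy as np


def subsequent_mask(size):
    """Boolean mask of shape (1, size, size); entry [0, i, j] is True
    when position i may attend to position j (j <= i)."""
    upper = np.triu(np.ones((1, size, size), dtype=np.uint8), k=1)
    return upper == 0


def pad_mask(token_ids, pad_idx):
    """Boolean mask of shape (batch, 1, length); True for real tokens."""
    return (np.asarray(token_ids) != pad_idx)[:, None, :]


mask = subsequent_mask(4)
print(mask[0].astype(int))

# Hypothetical class ids with pad_idx = 0, combined by broadcasting.
ids = [[5, 7, 2, 0]]
combined = pad_mask(ids, pad_idx=0) & mask
print(combined[0].astype(int))
```

The causal mask depends only on the sequence length, so it can be built once. A padding mask depends on the labels of each batch, which is why implementations that pad their labels usually combine the two per batch.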

MultiGPU

Is it possible to train on multiple GPUs? Thanks!

cannot install without a GPU

Hi. There seems to be a problem when I try to install on my MacBook, which has no GPU:

$ pip install -r requirements.txt
Collecting chainer==6.5.0
  Downloading https://files.pythonhosted.org/packages/1d/59/aa63339001ca8e15ebb560d0c33333ef465c479e165d967e64c7611b6e67/chainer-6.5.0.tar.gz (876kB)
     |████████████████████████████████| 880kB 508kB/s
Collecting chainercv==0.13.1
  Downloading https://files.pythonhosted.org/packages/e8/1c/1f267ccf5ebdf1f63f1812fa0d2d0e6e35f0d08f63d2dcdb1351b0e77d85/chainercv-0.13.1.tar.gz (260kB)
     |████████████████████████████████| 266kB 676kB/s
Collecting cupy==6.5.0
  Downloading https://files.pythonhosted.org/packages/67/4b/6960cdfeee8bbfa12450da6b83206b57f6d6951a74043f055905449bb657/cupy-6.5.0.tar.gz (3.1MB)
     |████████████████████████████████| 3.1MB 959kB/s
    ERROR: Command errored out with exit status 1:
     command: /Users/sebastienvincent/.virtualenvs/kiss/bin/python3.7 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/setup.py'"'"'; __file__='"'"'/private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/pip-egg-info
         cwd: /private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/
    Complete output (46 lines):
    Options: {'package_name': 'cupy', 'long_description': None, 'wheel_libs': [], 'wheel_includes': [], 'no_rpath': False, 'profile': False, 'linetrace': False, 'annotate': False, 'no_cuda': False}

    -------- Configuring Module: cuda --------
    /var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/tmpde2uw1h1/a.cpp:1:10: fatal error: 'cublas_v2.h' file not found
    #include <cublas_v2.h>
             ^~~~~~~~~~~~~
    1 error generated.
    command 'gcc' failed with exit status 1

    ************************************************************
    * CuPy Configuration Summary                               *
    ************************************************************

    Build Environment:
      Include directories: []
      Library directories: []
      nvcc command       : (not found)

    Environment Variables:
      CFLAGS          : (none)
      LDFLAGS         : (none)
      LIBRARY_PATH    : (none)
      CUDA_PATH       : (none)
      NVTOOLSEXT_PATH : (none)
      NVCC            : (none)

    Modules:
      cuda      : No
        -> Include files not found: ['cublas_v2.h', 'cuda.h', 'cuda_profiler_api.h', 'cuda_runtime.h', 'cufft.h', 'curand.h', 'cusparse.h', 'nvrtc.h']
        -> Check your CFLAGS environment variable.

    ERROR: CUDA could not be found on your system.
    Please refer to the Installation Guide for details:
    https://docs-cupy.chainer.org/en/stable/install.html

    ************************************************************

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/setup.py", line 132, in <module>
        ext_modules = cupy_setup_build.get_ext_modules()
      File "/private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/cupy_setup_build.py", line 632, in get_ext_modules
        extensions = make_extensions(arg_options, compiler, use_cython)
      File "/private/var/folders/z_/yklxqshn4nv69bd4t63rsln40000gn/T/pip-install-yidh2_r0/cupy/cupy_setup_build.py", line 387, in make_extensions
        raise Exception('Your CUDA environment is invalid. '
    Exception: Your CUDA environment is invalid. Please check above error log.
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Is it possible to use kiss with CPU only?
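
The log shows the failure while building cupy, which needs the CUDA headers. Chainer itself can be installed without CuPy, so one thing to try is installing only the remaining pins. The sketch below filters the three pins visible in the log. drop_gpu_requirements is a hypothetical helper, and whether the rest of kiss then runs on a CPU is not confirmed by the source.

```python
def drop_gpu_requirements(lines, gpu_packages=("cupy",)):
    """Return the requirement lines that do not name one of gpu_packages."""
    kept = []
    for line in lines:
        name = line.split("==")[0].strip().lower()
        if name not in gpu_packages:
            kept.append(line)
    return kept


# The three pins visible in the log above.
requirements = ["chainer==6.5.0", "chainercv==0.13.1", "cupy==6.5.0"]
cpu_requirements = drop_gpu_requirements(requirements)
print(cpu_requirements)  # ['chainer==6.5.0', 'chainercv==0.13.1']
```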

Loss Functions

Hey again,
I had a few questions about the loss functions you used for the Localization net during training.

  • In the Out Of Image loss calculation you add/subtract 1.5 to/from the bbox instead of 1 (as in your paper). Why do you do this?

  • Also why are you using corner coordinates for loss calculations?

  • Was the DirectionLoss used in your paper?

SVT Evaluation

The SVT test_crop=img folder contains 0-byte images, and I am getting an error because of that.
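
A quick way to list the affected files is to walk the folder and collect everything with a size of zero. The sketch below uses a temporary stand-in directory so it runs without the SVT data; find_empty_files and the file names are illustrative, not part of kiss.

```python
import os
import tempfile


def find_empty_files(root):
    """Return the sorted paths of all zero-byte files below root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


# Stand-in directory so the sketch runs without the SVT data.
demo_root = tempfile.mkdtemp()
with open(os.path.join(demo_root, "ok.jpg"), "wb") as handle:
    handle.write(b"\xff\xd8")
open(os.path.join(demo_root, "broken.jpg"), "wb").close()

empty_names = [os.path.basename(p) for p in find_empty_files(demo_root)]
print(empty_names)  # ['broken.jpg']
```

Pointing find_empty_files at the real crop folder shows whether every image is empty, which would suggest the cropping step failed, or only a few.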

What is gt.mat?

I have real-world images and their labels as ground-truth text files. How can I create the gt.mat file, or crop the words using crop_words_from_oxford.py?

Use pretrained model and continue training on own data

Hi, thanks for this great project. Is it possible to use the pretrained model and continue training on a custom training set containing 5000 images (grayscale images with a dotted font)? Do you think the results will be good?
