
Comments (6)

pvoigtlaender avatar pvoigtlaender commented on July 17, 2024

Hi,

which kind of model do you use for training? Note that the data layout used in the write_to_hdf function from demos/mdlstm/IAM/create_IAM_dataset.py is only suitable for a 2D LSTM network, while the 1D networks use a different layout.

Do the demos work for you? If yes, then you should try to stick as close as possible to the way the demo creates the data.

Also note that in the example, the data is put under the "inputs" key, not under the "data" key (although I'm not sure whether this matters).

Please also have a look at https://github.com/rwth-i6/returnn/blob/master/demos/mdlstm/artificial/create_test_h5.py which is a very simple script which shows how to properly create a data file for a 2D LSTM network.
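To make the pointer to create_test_h5.py more concrete, here is a minimal sketch of writing such a data file with h5py. The dataset names "inputs" and "sizes" are assumptions taken from this thread, not a verified spec, and write_2d_dataset is a hypothetical helper; compare against the actual demo script before relying on the exact layout.

```python
# Sketch of a 2D-LSTM-style data file, loosely modeled on the layout
# discussed in this thread. NOT the authoritative RETURNN format.
import h5py
import numpy as np

def write_2d_dataset(path, images):
    """images: list of 2D float32 arrays of shape (height, width)."""
    # Flatten each image to (height*width, 1) and concatenate along axis 0,
    # keeping the per-image 2D shapes in a separate "sizes" array.
    flat = np.concatenate([img.reshape(-1, 1) for img in images], axis=0)
    sizes = np.array([img.shape for img in images], dtype="int32").reshape(-1)
    with h5py.File(path, "w") as f:
        f.create_dataset("inputs", data=flat.astype("float32"))
        f.create_dataset("sizes", data=sizes)

imgs = [np.random.rand(20, 30).astype("float32"),
        np.random.rand(10, 40).astype("float32")]
write_2d_dataset("demo_2d.h5", imgs)
```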

If this still does not work for you, please let us know.

from returnn.

mmedinajiem avatar mmedinajiem commented on July 17, 2024

Thank you for your reply!

I'm using a model trained with the demo included in the code (demos/mdlstm/IAM/go.sh). The data I want to send are just images from the IAM dataset, so it's 2D.

Yeah. The demos work. I'm creating the data pretty much in the same way it's being done in the code.

What I'm trying to do is set up a demo on a web service, where you can load a trained model, send it data to recognize, get the result, and show it on a web page. The issue I'm having is that when I send just one image, I get a result, decode it, and it's fine, but when sending more than one I get a single long sequence, and when I decode it, it's gibberish. I want to send a request with more than one image at the same time, if possible.

Working on an AWS instance (2 GB GPU), if I send three images to the daemon, it crashes:

python2.7: mod.cu:3443: int _GLOBAL__N__38_tmpxft_00005d36_00000000_9_mod_cpp1_ii_5bcebdd5::__struct_compiled_op_b7d1f699ec8aa72531b9afc40db7fbc6::run(): Assertion `V49' failed.
Fatal Python error: Aborted

Current thread 0x00007f8738f30740 (most recent call first):
  File "/home/mmedina/ReturRNN/dlenv/local/lib/python2.7/site-packages/theano/compile/function_module.py", line 859 in __call__
  File "/home/mmedina/ReturRNN/returnn/Device.py", line 759 in compute_run
  File "/home/mmedina/ReturRNN/returnn/Device.py", line 1034 in process_inner
  File "/home/mmedina/ReturRNN/returnn/Device.py", line 887 in process
  File "/home/mmedina/ReturRNN/returnn/TaskSystem.py", line 1195 in _asyncCall
  File "/home/mmedina/ReturRNN/returnn/TaskSystem.py", line 470 in funcCall
  File "/home/mmedina/ReturRNN/returnn/TaskSystem.py", line 957 in checkExec
  File "/home/mmedina/ReturRNN/returnn/TaskSystem.py", line 1304 in <module>
Dev gpu0 proc died: recv_bytes EOFError: 
device crashed on batch 0

If I do the same on a local server with 3 Titan X GPUs (36 GB RAM), it does not crash, but I only get one long sequence as a result, and when decoding it I just get nonsense. Judging by this, my guess is that the library somehow treats the data contained in "data" as a single image and processes it as one sequence.

This is the code I use to create the data and the JSON I pass to the daemon:

import numpy as np
from scipy.misc import imread


def normalize_image(imgfile, pad_x=15, pad_y=15):
    """Read an image, invert it, zero-pad it, and flatten it to (N, 1) floats."""
    img = imread(imgfile)
    img = 255 - img
    img = np.pad(img, ((pad_y, pad_y), (pad_x, pad_x)), 'constant')
    padded_shape = img.shape
    img = img.reshape(img.size, 1)
    img = img.astype("float32") / 255.0
    return img, padded_shape


def build_json_from_file(imgfile):
    data_structure = {}
    data_structure['classes'] = [79, 1]
    data_structure['data'] = []
    data_structure['sizes'] = []

    imgs = []
    padded_shapes = []

    with open(imgfile, 'r') as f:
        for image in f.read().splitlines():
            # Pass both paddings explicitly (a stray `False` here would be
            # taken as pad_y=0).
            img, padded_shape = normalize_image(image, 15, 15)
            imgs.append(img)
            padded_shapes.append(padded_shape)

    imgs = np.concatenate(imgs, axis=0)
    print np.array(imgs).shape
    padded_shapes = np.concatenate(padded_shapes, axis=0)

    data_structure['sizes'] += padded_shapes.tolist()

    for img in imgs:
        as_list = img.tolist()
        data_structure['data'].append(as_list)

    return data_structure
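For reference, the normalization steps above (invert, pad, flatten, scale) can be checked in isolation on a synthetic array; this sketch uses only NumPy and skips the file I/O:

```python
import numpy as np

# Synthetic 8-bit "image": 4 rows x 6 columns, constant gray value 200.
img = np.full((4, 6), 200, dtype=np.uint8)

pad_x, pad_y = 15, 15
out = 255 - img                                                   # invert
out = np.pad(out, ((pad_y, pad_y), (pad_x, pad_x)), 'constant')   # zero-pad
padded_shape = out.shape                                          # (34, 36)
out = out.reshape(out.size, 1).astype("float32") / 255.0          # flatten + scale

print(padded_shape)      # (34, 36)
print(out.shape)         # (1224, 1)
```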

If sending multiple images to the daemon in one request is not possible, I was thinking of receiving the images in a generic request, creating an .h5 file from them, firing up ./rnn.py with a custom configuration file that points to the created .h5 file, and then, when finished, collecting the results somehow, building a response, and sending it back to the caller, but that seems like too much.

Please let me know if I'm not clear enough. I've been stuck on this issue for a couple of days now and I may be missing or omitting something.

I really appreciate your help.

Thanks!


pvoigtlaender avatar pvoigtlaender commented on July 17, 2024

Hi,

first of all, the error "Assertion `V49' failed." indicates that the GPU ran out of memory (sorry for the unspecific error message there; we should improve this).

And yes, it should be possible to forward multiple images at the same time.

You said that the demo for training works. Does it also work when you use the demo data for forwarding?

So far I haven't been able to see where the problem comes from. Can you please send me the config and a small h5 data file you are using to p.voigtlaender [at] gmail.com ?

Edit: Please note that when forwarding to hdf, the result is stored as one long sequence, which has to be split using the seqLengths, so "it does not crash, but I only get a long sequence as result" is expected. However, the result should not be nonsense but the concatenation of the contents of both images in this case.
Btw, the daemon you are using is an experimental and undocumented feature. Maybe first try to get everything working with "normal" forwarding to hdf5, so we can isolate the problem.
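The splitting described above can be done with a running offset over the sequence lengths; a minimal NumPy sketch (the variable names are illustrative, not RETURNN API):

```python
import numpy as np

# Concatenated forwarding output: 7 frames total, 3 classes per frame.
output = np.arange(21, dtype="float32").reshape(7, 3)
seq_lengths = [4, 3]  # lengths of the two forwarded sequences

# Slice the long sequence back into per-image results.
offsets = np.cumsum([0] + seq_lengths)
per_image = [output[offsets[i]:offsets[i + 1]] for i in range(len(seq_lengths))]

print([p.shape for p in per_image])  # [(4, 3), (3, 3)]
```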


mmedinajiem avatar mmedinajiem commented on July 17, 2024

Thanks again for your reply. Really appreciate that you're taking time to do this.

I have not tried forwarding. I'll send you the config file. Also, I'm not using any h5 file so far: I read the image from disk and convert it into the format the JSON expects (based on how you prepare the data for writing to an h5 file in demos/mdlstm/IAM/create_IAM_dataset.py). I'll send you the full program I'm using so you can have a better look at what I'm trying to achieve.

About the daemon: I understand. I found it while studying the code and thought it was the easiest option to get something working.


doetsch avatar doetsch commented on July 17, 2024

Hi Manuel,

So you are currently using the daemon functionality within RETURNN as defined in Engine.py? This is a very experimental feature which so far has only been used in a toy chat bot experiment.

There might be bugs, but in general you should be able to use it. Each call to classify only accepts a single sequence for now, but you can simply make multiple classify requests asynchronously, remember the hashes it returns, and ask for them in any order to get the results (or a message that they are not done yet). There is no need to wait for previous results before making new requests.
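The request pattern described above (submit everything, keep the returned hashes, poll in any order) can be sketched as follows. `classify_stub` and `fetch_stub` are purely hypothetical stand-ins for the daemon's endpoints, included only to show the control flow:

```python
import hashlib

# Hypothetical stand-in for the daemon: classify accepts one sequence,
# returns a hash handle immediately, and stores the result for later.
_results = {}

def classify_stub(sequence):
    handle = hashlib.md5(repr(sequence).encode()).hexdigest()
    _results[handle] = [x * 2 for x in sequence]  # placeholder "recognition"
    return handle

def fetch_stub(handle):
    return _results.get(handle)  # None would mean "not done yet"

# Submit all sequences first, remembering the handles ...
handles = [classify_stub(seq) for seq in ([1, 2], [3], [4, 5, 6])]

# ... then collect the results in any order, no waiting in between.
for h in reversed(handles):
    print(fetch_stub(h))
```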

If there are more performance requirements then I can look into extending the server to support batches of sequences.


mmedinajiem avatar mmedinajiem commented on July 17, 2024

