Comments (10)

doetsch commented on July 17, 2024

Unfortunately I am not able to reproduce the error. Are you using the most recent commit? You can also try to deactivate caching by setting cache_size to "8G".

from returnn.

pvoigtlaender commented on July 17, 2024

I was able to reproduce it:

ValueError: could not broadcast input array from shape (151211,1) into shape (107131,1)

KeyboardInterrupt
train epoch 2, batch 191, cost:output 2.81097157796, elapsed 0:07:17, exp. remaining 1:44:52, complete 6.49%
1:44:52 [||||||||||||| 6.49%

But I don't know yet what the problem is.
Edit: setting cache_size to "8G" did not help, and it loads the data anyway:
1:47:03 [|||||||||| 4.81% ]running 2 sequence slices (473110 nts) of batch 141 on device gpu0
train epoch 2, batch 141, cost:output 3.09054326076, elapsed 0:05:26, exp. remaining 1:47:47, complete 4.81%
1:47:47 [|||||||||| 4.81% ]loading file features/raw/train.2.h5
running 2 sequence slices (463386 nts) of batch 142 on device gpu0
loading file features/raw/train.1.h5
TaskThread train failed
Unhandled exception <type 'exceptions.AssertionError'> in thread <TrainTaskThread(TaskThread train, started daemon 140624219232000)>, proc 23277.
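
For readers unfamiliar with the ValueError quoted above: this is the message NumPy raises when an array is assigned into a slice that has fewer rows than the array. The sketch below only illustrates that error class with small made-up shapes; the `copy_into_slot` helper and the buffer sizes are hypothetical and are not taken from the RETURNN code.

```python
import numpy as np


def copy_into_slot(cache, start, seq):
    """Copy `seq` into `cache` beginning at row `start` (hypothetical helper).

    If fewer than len(seq) rows remain, the target slice is silently
    truncated and NumPy raises ValueError on the assignment.
    """
    cache[start:start + len(seq)] = seq


# Assumed toy sizes: a 10-row buffer and an 8-row sequence.
cache = np.zeros((10, 1), dtype="float32")
seq = np.ones((8, 1), dtype="float32")

try:
    copy_into_slot(cache, 5, seq)  # only 5 rows are left from row 5 on
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

An undersized target buffer is one way such a mismatch can arise, which would be consistent with the cache size discussion later in the thread.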

pvoigtlaender commented on July 17, 2024

For a quick fix, you could try putting all the data into one file instead of two, although this does not solve the actual issue, of course.
You can also try an older commit. The demo used to work in earlier commits. If you can find out which commit broke it, it might be easy to fix.

cwig commented on July 17, 2024

Thanks for looking into this. I did try putting all the training data in one file, and I still had the issue. To do that, I modified lines 203-209 of create_IAM_dataset.py.

I'll try an older commit.

cwig commented on July 17, 2024

This doesn't solve the actual problem, but it worked when I reverted to commit 82be088.

pvoigtlaender commented on July 17, 2024

Is this the last commit which works? It would be very helpful to find it, so we can see which change was the problem.

cwig commented on July 17, 2024

I'm not sure. I have only tried two so far. a925c7a did not work, so the breaking change is somewhere between 82be088 and a925c7a.
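
Narrowing down a range between a known-good and a known-bad commit is a binary search, which is what `git bisect` automates. A minimal sketch of the idea follows, assuming the commits are listed oldest to newest with 82be088 first and a925c7a last; the intermediate commit ids and the `is_good` check are hypothetical placeholders, not real RETURNN commits.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_good: Callable[[str], bool]) -> str:
    """Return the first failing commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history; only the two end points appear in this thread.
history = ["82be088", "hypothetical-1", "hypothetical-2",
           "hypothetical-3", "a925c7a"]
broken_from = "hypothetical-2"  # made up for the demonstration


def demo_is_good(commit: str) -> bool:
    # Stand-in for "check out the commit and run the demo".
    return history.index(commit) < history.index(broken_from)


culprit = first_bad_commit(history, demo_is_good)
print(culprit)
```

In practice the check would be a manual or scripted run of the failing demo at each commit, so the number of runs grows only logarithmically with the size of the range.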

doetsch commented on July 17, 2024

Could you try the most recent commit? There seems to be an issue with the cache size calculation on a few machines, and it took me a while to reproduce it. Hard-coding it to 16GB in config_real, as done in commit 2d1744c, resolved the issue for me on this machine.

pvoigtlaender commented on July 17, 2024

With cache_size set to 256G (as in the latest version in the repository), it now works with the latest commit. There seems to be a problem with the size calculation for two-dimensional data, so for now just set cache_size to a very high value.
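
To make the size discussion concrete, here is a rough sketch of how a size string such as "8G" or "256G" might be compared with the memory that a two-dimensional feature array needs. Both helpers are hypothetical and are not RETURNN's implementation; binary units (1G = 1024**3 bytes) and float32 features are assumptions, and the frame count is borrowed from the error message only as an example.

```python
import numpy as np

# Assumed binary multipliers; the real parser may use different units.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a string like '8G' into a number of bytes (hypothetical)."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(float(text[:-1]) * _UNITS[text[-1]])
    return int(text)


def data_nbytes(num_frames, dim, dtype="float32"):
    """Bytes needed for a (num_frames, dim) array of the given dtype."""
    return num_frames * dim * np.dtype(dtype).itemsize


budget = parse_size("8G")
needed = data_nbytes(151211, 1)  # example frame count, one value per frame
print(budget, needed, needed <= budget)
```

The point of the sketch is that the second dimension has to enter the byte count; dropping it would make a cache budget look larger than it is.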

doetsch commented on July 17, 2024

The fix has been confirmed on three independent machines. Therefore I am closing this issue.
