
Comments (5)

sfalkena commented on August 27, 2024

Thanks for your fast answer. I am currently testing whether keeping the cache file on disk works without slowing training down too much. Otherwise, I'll use a workaround for now and train the stages separately. If I have a bit more time in the future, I am happy to contribute to moving data loading a level up.
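
For readers who want to try the on-disk cache mentioned above, here is a minimal sketch. It assumes TensorFlow 2.x and relies on `tf.data.Dataset.cache(filename)`, which writes the cache to files instead of holding it in RAM. The toy dataset, the temporary directory, and the `train_cache` file prefix are placeholders for illustration, not part of the zoo code.

```python
import os
import tempfile

import tensorflow as tf

# Placeholder location for the cache files; in practice this would be a fast local disk.
cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, "train_cache")

# Toy stand-in for an expensive input pipeline (decoding, preprocessing, ...).
dataset = tf.data.Dataset.range(5).map(lambda x: x + 1)

# Passing a filename makes tf.data cache to disk rather than to memory.
dataset = dataset.cache(cache_path)

# The first full pass fills the cache; later passes read from the cache files.
first_pass = [int(x) for x in dataset.as_numpy_iterator()]
second_pass = [int(x) for x in dataset.as_numpy_iterator()]
```

Whether this is fast enough depends on the disk; the cache only becomes usable after one complete pass over the dataset.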

from zoo.

jneeven commented on August 27, 2024

Hi! ImageNet is indeed quite large, which is why we use multi-GPU training. On the compute platform we use, each GPU comes with several CPUs and 64 GB of RAM, so ImageNet does fit into memory if you use four GPUs in parallel. Unfortunately I don't think there is currently a clean way to cache only part of a dataset. You could try to split the dataset up, cache one part, and then join both parts again with tf.data.Dataset.concatenate, but I doubt this would be very efficient either way. Not caching the dataset is probably your best solution (although it is unfortunately a bit slow). Good luck!
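
The split-and-concatenate idea could look roughly like the sketch below. It assumes TensorFlow 2.x and uses only `take`, `skip`, `cache`, and `concatenate` from `tf.data.Dataset`; the helper name `partially_cached` and the toy dataset are hypothetical. Note that `skip` may still have to read through the skipped elements on every pass, which is one reason this is unlikely to be very efficient.

```python
import tensorflow as tf


def partially_cached(dataset, num_cached):
    """Cache only the first `num_cached` elements in memory (hypothetical helper)."""
    head = dataset.take(num_cached).cache()  # held in RAM after the first pass
    tail = dataset.skip(num_cached)          # recomputed on every pass
    return head.concatenate(tail)


# Toy stand-in for a preprocessed dataset.
source = tf.data.Dataset.range(10).map(lambda x: x * 2)
mixed = partially_cached(source, num_cached=4)

# Element order and values are unchanged; only where they come from differs.
values = [int(x) for x in mixed.as_numpy_iterator()]
```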

AdamHillier commented on August 27, 2024

Just to add, if you're on a single-GPU machine, disabling caching shouldn't have too much of an impact, especially if your dataset is stored on fast storage, e.g. an SSD.

sfalkena commented on August 27, 2024

Hi, I have an additional question about caching ImageNet. I can configure my training setup with enough RAM to cache ImageNet. However, when I run a multistage experiment, RAM usage increases in the second stage. I think the problem lies in the fact that each TrainLarqZooModel caches the dataset again. Are you aware of any way to reuse the dataset across stages, or to release the dataset from RAM? Perhaps it would make sense to move data loading one level up so that it gets called once per experiment, regardless of whether that is a single-stage or multistage experiment.
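
As a rough illustration of "moving data loading one level up", the sketch below loads the dataset once in the experiment and hands the same object to every stage. All names here (`Stage`, `MultiStageExperiment`, `load_dataset`) are hypothetical and do not reflect the actual zoo classes; a plain list stands in for the cached dataset.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """Hypothetical training stage that receives an already-loaded dataset."""
    name: str

    def run(self, dataset):
        # Stand-in for model.fit(dataset); the stage never loads data itself.
        return f"{self.name} trained on {len(dataset)} examples"


@dataclass
class MultiStageExperiment:
    """Hypothetical experiment that owns data loading for all of its stages."""
    load_dataset: Callable[[], list]
    stages: List[Stage]

    def run(self):
        dataset = self.load_dataset()  # loaded (and cached) once per experiment
        return [stage.run(dataset) for stage in self.stages]


load_calls = []  # records every call so the number of loads can be checked


def load_dataset():
    load_calls.append(1)
    return list(range(1000))  # placeholder for the real, cached dataset


experiment = MultiStageExperiment(load_dataset, [Stage("stage_1"), Stage("stage_2")])
logs = experiment.run()
```

With this layout a single-stage experiment is just the special case of one entry in `stages`, and the dataset is held in memory only once.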

jneeven commented on August 27, 2024

Hi @sfalkena, I have indeed run into this problem before and didn't realise it would apply here as well. Unfortunately I have not found any robust way to remove the dataset from RAM, so your suggestion of moving the dataset one level up sounds like the best approach. In my case I wasn't using ImageNet and I was only running slightly over memory limits, so it was feasible to just increase RAM slightly, but that is of course not a maintainable solution, especially at ImageNet scale.
