Comments (8)

AlexTo commented on June 18, 2024

hi @SiddhantSadangi

I'm using Neptune 1.9.1 and PyTorch Lightning 1.9.5 with ModelCheckpoint.

ModelCheckpoint uses the version property of NeptuneLogger, as in this line, to construct the checkpoint path.

However, I also just realized that NeptuneLogger is a wrapper class from PyTorch Lightning, not from Neptune.

Probably a false alarm, thanks for your response. I think I need to look into their wrapper class to see where the bug comes from.

from neptune-client.

AlexTo commented on June 18, 2024

@SiddhantSadangi sorry for the late reply. I've been busy the last few days upgrading my code to Lightning 2.2, which makes this issue no longer applicable to me, as it happens only with ModelCheckpoint in Lightning 1.9.x.
Actually, my workflow is the opposite: I disabled uploading model checkpoints to the cloud because my models are quite large. I also don't need to store them in the cloud, because the checkpoints are mostly for local evaluation and deployment. Only the configuration and results need to be stored on Neptune servers.

SiddhantSadangi commented on June 18, 2024

Perfect 🎉
Looks like the issue was on Lightning's end, not ours.

I am closing this thread, but please feel free to reach out if you need any further support 🤗

SiddhantSadangi commented on June 18, 2024

Hey @AlexTo 👋

Can you help me understand what the issue here is?

The folder created to store uploaded model checkpoints is always model/checkpoints, and is not related to run_short_id.
Also, the run ID is created only once the run has been initialized in the sync/async mode.

Is this not what you are expecting?

SiddhantSadangi commented on June 18, 2024

Thanks for the update!

As seen here, the path where model checkpoints are uploaded to Neptune is hardcoded to model/checkpoints, so you should not be seeing the checkpoints being uploaded to the None folder. Please let me know if this is the case though.

AlexTo commented on June 18, 2024

As mentioned above, I'm using ModelCheckpoint, so I guess it is a bit different. From the code snippet in my comment, here is how ModelCheckpoint constructs the checkpoint path:

[image: code snippet showing how ModelCheckpoint constructs the checkpoint path]

So, for me, the folder created is like this

.neptune/model_name/version_None/checkpoints

because trainer.loggers[0].version, which is NeptuneLogger.version, returns None.
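
For anyone following along, here is a minimal sketch of how a folder name like the one above can come about. It is not the Lightning source: `resolve_ckpt_dir` is a hypothetical helper, and the rule that a non-string version gets formatted into `version_<value>` is an assumption based on the path shown in this thread. The run ID `RUN-1` is made up for illustration.

```python
import os
from typing import Optional, Union


def resolve_ckpt_dir(save_dir: str, name: str, version: Optional[Union[int, str]]) -> str:
    """Hypothetical helper: join save_dir, logger name and version into a checkpoint folder."""
    # Assumption: a string version is used as-is, anything else (an int, or None)
    # is formatted into "version_<value>". That is how None leaks into the path.
    version_dir = version if isinstance(version, str) else f"version_{version}"
    return os.path.join(save_dir, str(name), version_dir, "checkpoints")


# What happens when the logger's version property returns None.
broken = resolve_ckpt_dir(".neptune", "model_name", None)

# What one would expect once the logger reports a real run ID (made-up value).
with_run_id = resolve_ckpt_dir(".neptune", "model_name", "RUN-1")

print(broken)
print(with_run_id)
```

Under that assumption, the local folder only looks right if the logger's version is already a string by the time the checkpoint path is resolved.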

I'll debug the NeptuneLogger in the next day or two and report back here.

SiddhantSadangi commented on June 18, 2024

Oh, you are referring to the local folder, not the folder created in the Neptune web app! Sorry for the confusion.

Could you share a code snippet for me to reproduce the issue?
Ideally, I'd need the snippets where you initialize ModelCheckpoint, NeptuneLogger, and Trainer.

SiddhantSadangi commented on June 18, 2024

Also, if you are syncing the runs with the Neptune servers, should it really matter where the models are saved locally pending upload?
Just curious
