Hello, I am a beginner in reinforcement learning. Thank you very much for providing such an accessible library for learning. Since the number of steps in each of my episodes is not fixed, I would like to know how to train and record metrics on a per-episode basis. If that is not easy to implement, would training on a per-step basis cause any problems, and what exactly is being recorded in that case? Is it the average over several episodes, or something else? Thank you!
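As a general illustration (not tied to this library's specific API), per-episode recording just means accumulating reward until the episode terminates and logging once per episode, which works regardless of how long each episode runs. The environment and all names below are hypothetical stand-ins:

```python
import random

class DummyEnv:
    """Hypothetical environment whose episodes have variable length."""
    def reset(self):
        self.t = 0
        self.horizon = random.randint(5, 15)  # episode length is not fixed
        return 0.0
    def step(self, action):
        self.t += 1
        reward = 1.0
        done = self.t >= self.horizon  # episode ends after a variable number of steps
        return 0.0, reward, done

env = DummyEnv()
episode_returns = []
for episode in range(10):
    obs = env.reset()
    ep_return, done = 0.0, False
    while not done:                      # run until the episode ends, whatever its length
        obs, reward, done = env.step(0)  # placeholder action
        ep_return += reward
    episode_returns.append(ep_return)    # one record per episode, not per step

print(len(episode_returns))  # 10
```

If a library instead logs per step, a common convention is to report a running average of the returns of the most recently completed episodes, so the curve is still interpretable even though the x-axis is environment steps.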
Hello, this is an issue about file 2.1. Since I ran the program on the CPU rather than CUDA, I changed the default device to CPU in the main script, and the model trained successfully.
But when loading the second model, I get: "RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU." I don't know how to fix this.
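The error message itself points to the fix: a checkpoint saved from CUDA tensors must be remapped to CPU at load time with `map_location`. A minimal self-contained sketch (the model and checkpoint path here are placeholders, not this library's actual files):

```python
import torch

# Stand-in for the model saved during training (hypothetical architecture).
model = torch.nn.Linear(4, 2)
torch.save(model.state_dict(), "checkpoint.pth")

# map_location forces every storage in the checkpoint onto the CPU, which avoids
# the "Attempting to deserialize object on a CUDA device" error on CPU-only machines.
state = torch.load("checkpoint.pth", map_location=torch.device("cpu"))
model.load_state_dict(state)
print(all(t.device.type == "cpu" for t in state.values()))  # True
```

In practice, find the `torch.load(...)` call in the loading script and add the `map_location=torch.device('cpu')` argument; saving with `state_dict()` rather than pickling the whole model also makes checkpoints more portable across devices.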
I hope this message finds you well. I am a PhD student from SJTU. A friend of mine recommended this repository to me. I noticed similarities between the tutorial I am currently developing and yours.
I am in the process of creating a Reinforcement Learning (RL) tutorial that aims to provide a comprehensive resource with both code examples and in-depth mechanism explanations. You can find the initial codebase for my tutorial at this repository: https://github.com/SCP-CN-001/RL101. At present, it appears that both of us have completed the coding segment of our respective tutorials.
I am reaching out to gauge your interest in collaborating on the documentation aspect of these tutorials. If you find merit in the idea of combining our efforts to enhance the educational value of our materials, I believe we can create a more comprehensive and impactful resource.
If this proposal intrigues you, please feel free to reach out to me via the email address provided on my GitHub profile: https://github.com/SCP-CN-001.
Looking forward to the possibility of collaborating with you on this endeavor.
Hi, I'm having a problem with action prediction in TD3. Once the agent starts learning, it tends to predict actions at the boundaries of the action space. Do you know what might be causing this?