Windows 10, CPU
To set up the environment, open the Anaconda Prompt and run the following commands:
conda create -n qutip_RL python=3.9
conda env list
conda activate qutip_RL
pip install -r requirements.txt
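Once the requirements are installed, a quick check that the key packages resolve in the active environment can save a failed run later. The sketch below is not part of the repository. The package list is an assumption based on the libraries this README mentions (gym, stable_baselines3, sb3_contrib) plus qutip, inferred from the environment name, and missing_packages is a hypothetical helper.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list; adjust it to match requirements.txt.
ASSUMED_PACKAGES = ["gym", "stable_baselines3", "sb3_contrib", "qutip"]

if __name__ == "__main__":
    # The environment created above uses Python 3.9.
    print("Python version:", sys.version.split()[0])
    missing = missing_packages(ASSUMED_PACKAGES)
    print("Missing packages:", missing if missing else "none")
```

Run it from the Anaconda Prompt after activating qutip_RL; any name it reports as missing probably needs to be installed before training.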
- Put all files from the directory .\cust_env\classical_control\ into the qutip_RL environment directory at C:\users\yourUserName\anaconda3\envs\qutip_RL\Lib\site-packages\gym\envs\classic_control\, replacing the original files.
- Copy the file .\cust_env\__init__.py and paste it over C:\users\yourUserName\anaconda3\envs\qutip_RL\Lib\site-packages\gym\envs\__init__.py, replacing the original file.
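The two copy steps above can also be scripted. The following is a minimal sketch, not a script shipped with this repository: find_gym_envs_dir and install_custom_envs are hypothetical helpers, and the sketch assumes it is run from the repository root with the qutip_RL environment activated, so the gym directory is located from the installed package instead of a hard-coded user path.

```python
import importlib.util
import shutil
from pathlib import Path


def find_gym_envs_dir():
    """Locate <site-packages>/gym/envs for the active environment without importing gym."""
    spec = importlib.util.find_spec("gym")
    if spec is None or spec.origin is None:
        raise RuntimeError("gym is not installed in the active environment")
    return Path(spec.origin).parent / "envs"


def install_custom_envs(cust_env_dir, gym_envs_dir):
    """Copy the custom environment files over the ones shipped with gym."""
    cust_env_dir, gym_envs_dir = Path(cust_env_dir), Path(gym_envs_dir)
    copied = []
    # Files from cust_env/classical_control go into gym/envs/classic_control.
    target = gym_envs_dir / "classic_control"
    target.mkdir(parents=True, exist_ok=True)
    for src in sorted((cust_env_dir / "classical_control").iterdir()):
        if src.is_file():
            copied.append(Path(shutil.copy2(src, target / src.name)))
    # cust_env/__init__.py replaces gym/envs/__init__.py.
    copied.append(Path(shutil.copy2(cust_env_dir / "__init__.py",
                                    gym_envs_dir / "__init__.py")))
    return copied


# Example call (overwrites files inside the installed gym package):
# for path in install_custom_envs("cust_env", find_gym_envs_dir()):
#     print("copied", path)
```

The example call is left commented out because it overwrites files in the installed gym package; consider backing up the originals first.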
Open Spyder by running the following command in the Anaconda Prompt:
spyder
- Files with the suffix fig_plot are used for plotting figures.
- Files with the suffix fig_code are used for plotting partial figures.
- Files with the suffix fig_data are used for generating data for figures.
- When running the training code (e.g., training_fig2_data.py), you can copy it into a new directory named code_test. There, you can reduce the training load by setting parameters such as n_episode = 10, n_steps = 10, n_update = 2, and output_interval = 2, which lets you test the code quickly.
- In the testing scripts (e.g., test_ave_fig2_code.py), the testing function (e.g., PPOtest) is commented out and all testing data are saved, so you can plot all results by running such a script (e.g., test_ave_fig2_code.py) directly.
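The quick-test copy described above can be produced with a short helper. This is only a sketch under a stated assumption: each parameter is set in the training script by an unindented line of the form `name = value`. The helper make_quick_test_copy is hypothetical and not part of the repository; if the scripts set these parameters differently, edit the copy by hand instead.

```python
import re
from pathlib import Path

# Reduced values quoted in the text above.
QUICK_TEST_OVERRIDES = {"n_episode": 10, "n_steps": 10, "n_update": 2, "output_interval": 2}


def make_quick_test_copy(script_path, dest_dir="code_test", overrides=QUICK_TEST_OVERRIDES):
    """Copy a training script into dest_dir with selected top-level assignments overridden."""
    script_path, dest_dir = Path(script_path), Path(dest_dir)
    text = script_path.read_text(encoding="utf-8")
    for name, value in overrides.items():
        # Assumption: the parameter is set by an unindented line `name = value`.
        pattern = rf"^{re.escape(name)}\s*=.*$"
        text, count = re.subn(pattern, f"{name} = {value}", text, flags=re.MULTILINE)
        if count == 0:
            print(f"warning: no top-level assignment found for {name}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / script_path.name
    dest.write_text(text, encoding="utf-8")  # the original script is left untouched
    return dest
```

For example, make_quick_test_copy("training_fig2_data.py") would write code_test\training_fig2_data.py with the four reduced values, which can then be run from Spyder.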
The complete training results for Figures 4 to 8 and Figures S1 and S2 are available on Zenodo: [https://doi.org/10.5281/zenodo.12584159]
- Stable-Baselines3 for the PPO agent: [https://stable-baselines3.readthedocs.io/en/master/index.html]
- SB3-Contrib for the recurrent PPO agent: [https://sb3-contrib.readthedocs.io/en/master/index.html]