cvignac / midi
MiDi: Mixed Graph and 3D Denoising Diffusion for Molecule Generation
License: MIT License
Congrats on this very nice work!!
Are you planning to release the final weights of the model trained on the GEOM dataset?
Thanks,
Octavio
python3 main.py dataset=qm0 dataset.remove_h=True +experiment=qm9_no_h
This line is not even runnable. There is no dataset called qm0 and no experiment called qm9_no_h.
There are some typos I guess.
Hello Clément,
I'm trying to use your code for the QM9 dataset, but I got this error:
...
File "D:\Users\antoine\Documents\Sorbonne\Stage_PS\CTDGG\src\datasets\abstract_dataset.py", line 23, in __getitem__
return self.dataloaders['train'][idx]
AttributeError: 'QM9DataModule' object has no attribute 'dataloaders'
It seems that the dataloaders attribute is not defined anywhere, which raises an error when the Mixin __getitem__ method is called. In the DiGress implementation, it was defined in the prepare_data method, which is no longer present in this implementation. How do you create this attribute now?
Note that I'm not using the ray or ray-lightning packages for now; could that interfere with the dataloader creation?
Nice update of the DiGress code btw, it's nice that you added multi-GPU support!
Best regards,
Antoine
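For readers hitting the same traceback, here is a minimal, self-contained sketch of the failure mode described above. It is not the repository's code: the class names are hypothetical stand-ins, and a plain list replaces the real dataloader. It only shows that a mixin `__getitem__` reading `self.dataloaders` fails unless some setup step assigns that attribute first.

```python
class DatasetMixin:
    """Hypothetical stand-in for the mixin in abstract_dataset.py."""

    def __getitem__(self, idx):
        # Raises AttributeError if `dataloaders` was never assigned.
        return self.dataloaders["train"][idx]


class BrokenDataModule(DatasetMixin):
    def __init__(self, train_data):
        # `dataloaders` is never created here, mirroring the reported error.
        self.train_data = train_data


class FixedDataModule(DatasetMixin):
    def __init__(self, train_data):
        self.train_data = train_data
        # Assumption: a plain list stands in for a real DataLoader.
        self.dataloaders = {"train": list(train_data)}


def try_first_item(module):
    """Return the first item, or the error message if indexing fails."""
    try:
        return module[0]
    except AttributeError as err:
        return f"AttributeError: {err}"
```

Calling `try_first_item(BrokenDataModule([10, 20]))` returns the error message, while the fixed variant returns the first element. Where the attribute should be created in the actual code base is a question for the maintainers.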
Hi Clément,
When trying to use the testing procedure, I encountered a bug related to input_dims. It seems that in the current version of the code, they are updated twice: once in get_resume/load_from_checkpoint and once when the model is created.
I don't know exactly where it comes from, but the following fix worked for me (at least for testing; I haven't tried resuming):
if cfg.general.test_only:
    cfg, model = get_resume(cfg, dataset_infos, train_smiles, to_absolute_path(cfg.general.test_only), test=True)
elif cfg.general.resume is not None:
    # When resuming, we can override some parts of previous configuration
    print("Resuming from {}".format(to_absolute_path(cfg.general.resume)))
    cfg, model = get_resume(cfg, dataset_infos, train_smiles, to_absolute_path(cfg.general.resume), test=False)
else:
    model = DiffusionModel(cfg=cfg, dataset_infos=dataset_infos, train_smiles=train_smiles)
Also, it seems that all runs share the same seed when num_final_sampling is greater than 1, so they all yield the same results.
Cheers,
Antoine
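One possible way to avoid identical sampling runs is to derive a distinct seed for each run from a base seed. The sketch below is only an illustration using Python's standard `random` module. The helper names `sampling_seeds` and `run_sampling` are hypothetical, and the actual code would need to reseed whatever random number generators the sampler uses.

```python
import random


def sampling_seeds(base_seed, num_final_sampling):
    """Derive one distinct seed per sampling run from a base seed."""
    return [base_seed + i for i in range(num_final_sampling)]


def run_sampling(seed, n=3):
    """Hypothetical stand-in for one sampling run: draws n random numbers."""
    rng = random.Random(seed)  # independent generator seeded for this run
    return [rng.random() for _ in range(n)]


# Each run gets its own seed, so the runs differ from one another
# while each individual run stays reproducible.
results = [run_sampling(seed) for seed in sampling_seeds(0, 3)]
```

With this scheme, rerunning with the same base seed reproduces every run, but runs within one call no longer coincide.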
Hi, thanks for this very nice work and code.
I noticed this project does not have an explicit license. Could you please add a license stating the terms of use?
When setting up the environment with conda, PyG cannot be installed. To solve this, specify the RDKit version when creating the environment:
conda create -c conda-forge -n MoleculeDiffusion rdkit=2022.03.5 python=3.9
Alternatively, @cvignac, it might be worth adding an environment.yml so we can directly install the same package versions you are using.
MiDi/midi_src/datasets/adaptive_loader.py", line 6, in
from torch_geometric.data import LightningDataset
ImportError: cannot import name 'LightningDataset' from 'torch_geometric.data'
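An import error like this usually means the installed library version exposes the class under a different module path, or not at all. As a hedged workaround sketch, the helper below tries several candidate module paths in order. The helper `import_first` is hypothetical, and the second path in the commented usage line (`torch_geometric.data.lightning`) is an assumption about newer PyG releases that should be checked against the installed version.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, name)
        except (ImportError, AttributeError) as err:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{path}: {err}")
    raise ImportError(f"cannot import {name!r}; tried: " + "; ".join(errors))


# Hypothetical usage (the second path is an assumption, not verified here):
# LightningDataset = import_first(
#     "LightningDataset",
#     ["torch_geometric.data", "torch_geometric.data.lightning"],
# )
```

Pinning the PyG version the code was developed against would be the more robust fix if the maintainers can share it.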
Hello,
I'm trying to use MiDi to generate molecules based on the model trained on GEOM with explicit H.
The trained model requires the dataset_infos as input, which needs the datamodule to get the statistics. However, I currently don't have enough RAM on my machine to load the training set of GEOM in the pickle file you provide.
I was thinking that having the processed files for the GEOMDrugsDataset would probably avoid the process() function (which runs when the processed files don't exist), and these files could be lighter than the whole pickle file containing the molecules. Can you provide those?
Or if you see another workaround (i.e. separating the statistics/configuration required for the dataset_infos into other files that do not always require the datamodule), please let me know.
Thank you very much,
Best,
Benoit
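The workaround suggested above, keeping the statistics in a separate small file, could look roughly like the sketch below. It is an illustration only: `load_or_compute_stats` and the statistic keys in the usage note are hypothetical, and which statistics dataset_infos really needs is not stated in this thread. The idea is that the expensive computation over the full dataset runs once, and later runs read a small JSON file instead.

```python
import json
import os


def save_stats(stats, path):
    """Write a dictionary of dataset statistics to a JSON file."""
    with open(path, "w") as f:
        json.dump(stats, f)


def load_or_compute_stats(path, compute_fn):
    """Load cached statistics if present; otherwise compute and cache them."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    # Expensive path: in practice this would need the full training set.
    stats = compute_fn()
    save_stats(stats, path)
    return stats
```

For example, `load_or_compute_stats("geom_stats.json", compute_fn)` would call `compute_fn` only on the first run. The file could then be shared so that users without enough RAM never need to load the full pickle.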