
midi's People

Contributors

cvignac

midi's Issues

Missing "dataloaders" attribute in datamodule

Hello Clément,

I'm trying to use your code with the QM9 dataset, but I get this error:

                                                             ...
File "D:\Users\antoine\Documents\Sorbonne\Stage_PS\CTDGG\src\datasets\abstract_dataset.py", line 23, in __getitem__
    return self.dataloaders['train'][idx]
AttributeError: 'QM9DataModule' object has no attribute 'dataloaders'

It seems that the dataloaders attribute is not defined anywhere, which raises an error when the Mixin's __getitem__ method is called. In the DiGress implementation it was defined in the prepare_data method, which is no longer present in this implementation. How do you create this attribute now?
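
For reference, here is a rough sketch of the DiGress-style pattern I was expecting, where prepare_data fills a dict of dataloaders keyed by split (this is not the actual MiDi code; the toy datasets and batch size are placeholders):

    import torch
    import pytorch_lightning as pl
    from torch.utils.data import DataLoader, TensorDataset

    class ToyDataModule(pl.LightningDataModule):
        """Placeholder datamodule illustrating the DiGress-style pattern."""

        def __init__(self, batch_size: int = 32):
            super().__init__()
            self.batch_size = batch_size
            self.dataloaders = None

        def prepare_data(self):
            # The real module would load/process QM9 here; these are toy tensors.
            datasets = {split: TensorDataset(torch.randn(128, 4))
                        for split in ('train', 'val', 'test')}
            # This is the attribute the Mixin's __getitem__ indexes into.
            self.dataloaders = {split: DataLoader(ds, batch_size=self.batch_size)
                                for split, ds in datasets.items()}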

Note that I'm not using the ray or ray-lightning packages for now; could that interfere with the dataloader creation?

Nice update of the DiGress code, by the way; it's great that you added multi-GPU support!

Best regards,

Antoine

Problems in resuming/testing + same seed for all test runs

Hi Clément,

When trying to use the testing procedure, I encountered a bug related to input_dims. It seems that in the current version of the code they are updated twice: once in get_resume/load_from_checkpoint and once when the model is created.

I don't know exactly where it comes from, but the following fix worked for me (at least for testing; I haven't tried resuming):

    # Build the model exactly once: either restored from a checkpoint
    # (for testing or resuming) or created from scratch.
    if cfg.general.test_only:
        cfg, model = get_resume(cfg, dataset_infos, train_smiles, to_absolute_path(cfg.general.test_only), test=True)
    elif cfg.general.resume is not None:
        # When resuming, we can override some parts of the previous configuration
        print("Resuming from {}".format(to_absolute_path(cfg.general.resume)))
        cfg, model = get_resume(cfg, dataset_infos, train_smiles, to_absolute_path(cfg.general.resume), test=False)
    else:
        model = DiffusionModel(cfg=cfg, dataset_infos=dataset_infos, train_smiles=train_smiles)

Also, it seems that all runs share the same seed when num_final_sampling is greater than 1, so they all yield the same results.
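
A possible workaround for the seed issue would be to derive a different seed for each final sampling run. A minimal sketch (not the actual MiDi code; the variable names are assumptions, only seed_everything is a real PyTorch Lightning call):

    import pytorch_lightning as pl

    base_seed = 0           # hypothetical base seed from the config
    num_final_sampling = 5  # hypothetical value of num_final_sampling
    for run_idx in range(num_final_sampling):
        # Offset the seed per run so the runs don't all produce the same samples.
        pl.seed_everything(base_seed + run_idx)
        # ... run one sampling/test pass here ...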

Cheers,

Antoine

Please add a license

Hi, thanks for this very nice work and code.

I noticed this project does not have an explicit license. Could you please add a license stating the terms of use?

pyg incompatible with rdkit version 2023

When setting up the environment with conda, pyg cannot be installed. To solve this, pin the RDKit version when creating the environment:

conda create -c conda-forge -n MoleculeDiffusion rdkit=2022.03.5 python=3.9

Alternatively, @cvignac, it might be worth adding an environment.yml so we can install exactly the package versions you are using.

Is it possible to release the GEOMDrugsDataset processed files?

Hello,

I'm trying to use MiDi to generate molecules based on the model trained on GEOM with explicit H.
The trained model requires dataset_infos as input, which needs the datamodule to compute the statistics. However, I don't currently have enough RAM on my machine to load the GEOM training set from the pickle file you provide.
I was thinking that having the processed files for the GEOMDrugsDataset would avoid running the process() function (which is called when the processed files don't exist), and these files might be lighter than the whole pickle file containing the molecules. Could you provide them?
Or, if you see another workaround (e.g. storing the statistics/configuration required for dataset_infos in separate files that don't always require the datamodule), please let me know.
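
To make the suggestion concrete, here is roughly what I had in mind (a sketch only; the statistics attribute and file name are hypothetical, not the actual MiDi API):

    import torch

    def save_dataset_statistics(dataset_infos, path='geom_statistics.pt'):
        # Suppose dataset_infos exposes its precomputed statistics as a plain
        # dict of tensors/numbers; save only that, not the full dataset.
        torch.save(dataset_infos.statistics, path)

    def load_dataset_statistics(path='geom_statistics.pt'):
        # Rebuilding dataset_infos from this file would avoid instantiating the
        # full GEOM datamodule (and loading the huge pickle) just to sample.
        return torch.load(path, map_location='cpu')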

Thank you very much,

Best,
Benoit
