sparsediff's Introduction

Sparse denoising diffusion for large graph generation

Official code for the paper "Sparse Training of Discrete Diffusion Models for Graph Generation" (arXiv:2311.02142).

Checkpoints to reproduce the results can be found at this link. Please refer to the updated version of our paper on arXiv.

Environment installation

This code was tested with PyTorch 2.0.1, CUDA 11.8 and torch_geometric 2.3.1.

  • Download anaconda/miniconda if needed

  • Create a conda environment that directly contains rdkit:

    conda create -c conda-forge -n sparse rdkit=2023.03.2 python=3.9

  • Activate the environment:

    conda activate sparse

  • Check that this line does not return an error:

    python3 -c 'from rdkit import Chem'

  • Install graph-tool (https://graph-tool.skewed.de/):

    conda install -c conda-forge graph-tool=2.45

  • Check that this line does not return an error:

    python3 -c 'import graph_tool as gt'

  • Install the CUDA toolkit (which provides nvcc) matching your CUDA version. For example:

    conda install -c "nvidia/label/cuda-11.8.0" cuda

  • Install a matching version of PyTorch, for example:

    pip3 install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118

  • Install other packages using the requirement file:

    pip install -r requirements.txt

  • Install mini-moses:

    pip install git+https://github.com/igor-krawczuk/mini-moses

  • Install the package in editable mode:

    pip install -e .

  • Navigate to the ./sparse_diffusion/analysis/orca directory and compile orca.cpp:

    g++ -O2 -std=c++11 -o orca orca.cpp
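The two "check that this line does not return an error" steps above can be bundled into one script. The following is a minimal sketch, not part of the repository: the function name `check_modules` and the module list are illustrative assumptions, with import names taken from the packages mentioned in the installation steps.

```python
import importlib.util

# Import names taken from the installation steps above (assumed list;
# extend it with whatever requirements.txt installs in your setup).
REQUIRED_MODULES = ["rdkit", "graph_tool", "torch", "torch_geometric"]


def check_modules(names):
    """Return {module_name: bool}, True when the top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Because `find_spec` only locates a module without importing it, this check is fast but does not catch a broken installation that fails at import time; the `python3 -c` one-liners above remain the stronger test.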

Run the code

  • All code is currently launched through python3 main.py. See the Hydra documentation (https://hydra.cc/) for how to override default parameters.
  • To run the debugging configuration: python3 main.py +experiment=debug.yaml. We recommend running in debug mode first before launching full experiments.
  • To run the code on only a few batches: python3 main.py general.name=test.
  • You can specify the dataset with python3 main.py dataset=guacamol. See configs/dataset for the list of datasets currently available.
  • You can set the edge fraction (denoted $\lambda$ in the paper) with python3 main.py model.edge_fraction=0.2 to control GPU usage.
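Several overrides can be combined in a single launch. The sketch below shows one way to assemble such a command line programmatically; `build_command` is a hypothetical helper, not part of the repository, and it only uses the override keys that appear in the list above.

```python
import shlex


def build_command(overrides, experiment=None):
    """Build a `python3 main.py ...` argument list from Hydra-style overrides.

    overrides  : dict mapping a config key (e.g. "dataset") to its value
    experiment : optional experiment file, appended as `+experiment=<file>`
    """
    cmd = ["python3", "main.py"]
    if experiment is not None:
        cmd.append(f"+experiment={experiment}")
    cmd.extend(f"{key}={value}" for key, value in overrides.items())
    return cmd


if __name__ == "__main__":
    cmd = build_command({"dataset": "guacamol", "model.edge_fraction": 0.2})
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run` from the directory that contains main.py, which avoids shell quoting issues.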

Cite the paper

@misc{qin2023sparse,
      title={Sparse Training of Discrete Diffusion Models for Graph Generation}, 
      author={Yiming Qin and Clement Vignac and Pascal Frossard},
      year={2023},
      eprint={2311.02142},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Troubleshooting

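PermissionError: [Errno 13] Permission denied: 'SparseDiff/sparse_diffusion/analysis/orca/orca': You probably did not compile orca. Compile orca.cpp in ./sparse_diffusion/analysis/orca as described in the last step of the environment installation.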
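To tell a missing binary apart from one that exists but cannot be executed, a quick check along these lines may help. This is a sketch only: `orca_status` is a hypothetical helper, and the path assumes the script is run from the repository root.

```python
import os


def orca_status(path):
    """Classify the orca binary as "missing", "not executable", or "ok"."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


if __name__ == "__main__":
    # Path relative to the repository root, taken from the compile step.
    print(orca_status("sparse_diffusion/analysis/orca/orca"))
```

A "missing" or "not executable" result suggests recompiling orca.cpp with the g++ command from the installation section.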

sparsediff's People

Contributors

qym7, cvignac

sparsediff's Issues

Assistance Required with HTTP Error 403 when Downloading QM9 Dataset in SparseDiff Project

Dear Author,

I hope this message finds you well. I am currently working with your SparseDiff project and have encountered a challenge that I believe requires your expertise. I am experiencing an HTTP 403 Forbidden error when attempting to download the QM9 dataset.

The error trace is as follows:

Error executing job with overrides: ['dataset=qm9']
Traceback (most recent call last):
...
urllib.error.HTTPError: HTTP Error 403: Forbidden

I have verified that the file paths are correct and have followed the code implementation you provided. Specifically, the error occurs when executing the following code snippet:

file_path = download_url(self.raw_url, self.raw_dir)
extract_zip(file_path, self.raw_dir)
os.unlink(file_path)
_ = download_url(self.raw_url2, self.raw_dir)
os.rename(
    osp.join(self.raw_dir, "3195404"),
    osp.join(self.raw_dir, "uncharacterized.txt"),
)

Diagonal mask in Laplacian computation

Hi,

Congrats on this nice work and thanks for sharing your code.

I'm a bit confused by something you do in the computation of the Laplacian:

L = self.compute_laplacian(A, normalize=False)
mask_diag = 2 * L.shape[-1] * torch.eye(A.shape[-1]).type_as(L).unsqueeze(0)
mask_diag = mask_diag * (~mask.unsqueeze(1)) * (~mask.unsqueeze(2))
L = L * mask.unsqueeze(1) * mask.unsqueeze(2) + mask_diag

Could you explain why you add this mask to the diagonal of the Laplacian?

Best,
Antoine
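For readers following this question, here is what the quoted lines compute, written out in plain Python on a small example. This is an illustrative sketch, not the repository's code and not an answer from the authors: `masked_laplacian` is a hypothetical name, and it assumes `mask` is True for real nodes and False for padding, with all adjacency entries of padded nodes equal to zero.

```python
def masked_laplacian(adj, mask):
    """Unnormalized Laplacian L = D - A, with 2n placed on the diagonal of padded nodes.

    adj  : n x n adjacency matrix as nested lists (0/1, symmetric, zero diagonal)
    mask : list of n booleans, True for real nodes and False for padding
    """
    n = len(adj)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if mask[i] and mask[j]:
                # Entry between two real nodes: keep the Laplacian value.
                degree = sum(adj[i]) if i == j else 0
                out[i][j] = degree - adj[i][j]
            elif i == j:
                # Diagonal entry of a padded node: set to 2 * n.
                out[i][j] = 2 * n
            # Every other entry touching a padded node stays 0.
    return out


if __name__ == "__main__":
    # Path graph 0-1-2 padded with one dummy node (index 3).
    adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
    mask = [True, True, True, False]
    for row in masked_laplacian(adj, mask):
        print(row)
```

The result is block diagonal, so each padded node contributes an eigenvalue of exactly 2n. Since the unnormalized Laplacian of a simple graph with at most n nodes has no eigenvalue larger than n, one plausible reading is that the added diagonal makes the padding eigenvalues sort after all real ones in an eigendecomposition. That interpretation is not confirmed in this thread.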

AttributeError: module 'sparse_diffusion.utils' has no attribute 'densify_noisy_data'

When I run the command ‘python3 main.py general.name=test’, the console displays an error message:

Error executing job with overrides: ['general.name=test']
Traceback (most recent call last):
  File "/Users/wzn/code/SparseDiff/sparse_diffusion/main.py", line 121, in main
    dataset_infos.compute_input_dims(
  File "/Users/wzn/code/SparseDiff/sparse_diffusion/datasets/abstract_dataset.py", line 204, in compute_input_dims
    ex_extra_feat = extra_features(example_data)
  File "/Users/wzn/code/SparseDiff/sparse_diffusion/diffusion/extra_features.py", line 52, in __call__
    noisy_data = utils.densify_noisy_data(sparse_noisy_data)
AttributeError: module 'sparse_diffusion.utils' has no attribute 'densify_noisy_data'

I could not find that function in utils.py.
How should I handle this so that the code runs properly?
