
pydpm's Introduction




A Python library focused on constructing Deep Probabilistic Models (DPMs). PyDPM not only provides efficient distribution sampling functions on GPU, but also includes implementations of existing popular DPMs.

Documentation | Paper [arXiv] | Tutorials | Benchmarks | Examples

News

🔥A new version that does not depend on PyCUDA has been released.

🔥An abundance of professional learning materials on Deep Generative Models from Prof. Ermon's group at Stanford University. (CS236 - Fall 2021)

🔥A tutorial on DPMs has been uploaded by Prof. Wilker Aziz (University of Amsterdam).

Install

The current version of PyDPM can be installed on either Windows or Linux from PyPI.

$ pip install pydpm

On Windows, we recommend installing Visual Studio 2019 as the compiler, together with the CUDA 11.5 toolkit; on Linux, we recommend installing the latest version of the CUDA toolkit.
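Before installing, it is worth confirming that the CUDA compiler is visible on your PATH, since a missing toolkit is the usual cause of the `nvcc: not found` error reported in the Issues section below:

$ nvcc --version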

The environment used for testing has been released to make it easy to reproduce our results.

$ conda env create -f enviroment.yaml

Overview

The framework of the PyDPM library can be roughly split into four parts, namely the Sampler, Model, Evaluation, and Example modules, illustrated as follows:

  1. The Sampler module includes both a basic Distribution Sampler and a more sophisticated Model Sampler, which together cover the sampling requirements of DPMs on either CPU or GPU;
  2. The Model module contains a wide variety of classical and popular DPMs, which can be called directly as Python APIs;
  3. The Evaluation module provides a DataLoader sub-module to process data samples in various forms (images, text, graphs, etc.) and a Metric sub-module to comprehensively evaluate DPMs after training;
  4. The Example module provides, for each DPM included in the Model module, a corresponding code demo with a detailed explanation in the official docs (a minimal import sketch follows this list).
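For orientation, the programmatic modules above map directly onto the import paths used in the Usage examples later in this README; the following minimal sketch only restates those imports and adds comments:

from pydpm.sampler import Basic_Sampler  # Sampler module: distribution sampling on CPU or GPU
from pydpm.model import PGBN             # Model module: DPMs exposed as Python classes (PGBN as an example)
from pydpm.metric import ACC             # Evaluation module: metrics for evaluating trained DPMs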

The workflow of applying PyDPM to downstream tasks can be split into four steps, as follows:

  1. Device deployment: PyDPM can be deployed on a platform with either CPU or GPU (see the sketch after this list);
  2. Training and testing mechanisms: models use Gibbs sampling, back propagation, or both, implemented with pydpm.sampler and PyTorch respectively;
  3. Model categories: PyDPM mainly includes Bayesian Probabilistic Models, Deep-Learning Probabilistic Models, and Hybrid Probabilistic Models;
  4. Applications: DPMs have been applied to Natural Language Processing (NLP), Graph Neural Networks (GNNs), Recommendation Systems (RS), etc.
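As a minimal sketch of step 1 (device deployment), both entry points shown in the Usage section take the target platform at construction time. Only 'gpu' appears in the README examples; passing 'cpu' instead is assumed here to select the CPU implementation, per the module description above:

from pydpm.model import PGBN
from pydpm.sampler import Basic_Sampler

# choose the platform for both the sampler and the model ('gpu' here; presumably 'cpu' for a CPU-only run)
sampler = Basic_Sampler('gpu')
model = PGBN([128, 64, 32], device='gpu')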

Model List

The Model module in PyDPM includes a wide variety of popular DPMs, which can be roughly split into three categories: Bayesian Probabilistic Models, Deep-Learning Probabilistic Models, and Hybrid Probabilistic Models.

Bayesian Probabilistic Models

Probabilistic Model Name | Abbreviation | Paper Link
Latent Dirichlet Allocation | LDA | Blei et al., 2003
Poisson Factor Analysis | PFA | Zhou et al., 2012
Poisson Gamma Belief Network | PGBN | Zhou et al., 2015
Convolutional Poisson Factor Analysis | CPFA | Wang et al., 2019
Convolutional Poisson Gamma Belief Network | CPGBN | Wang et al., 2019
Factor Analysis | FA |
Gaussian Mixture Model | GMM |
Poisson Gamma Dynamical Systems | PGDS | Zhou et al., 2016
Deep Poisson Gamma Dynamical Systems | DPGDS | Guo et al., 2018
Dirichlet Belief Networks | DirBN | Zhao et al., 2018
Deep Poisson Factor Analysis | DPFA | Gan et al., 2015
Word Embeddings Deep Topic Model | WEDTM | Zhao et al., 2018
Multimodal Poisson Gamma Belief Network | MPGBN | Wang et al., 2018
Graph Poisson Gamma Belief Network | GPGBN | Wang et al., 2020

Deep-Learning Probabilistic Models

Probabilistic Model Name | Abbreviation | Paper Link
Restricted Boltzmann Machines | RBM | Hinton et al., 2010
Variational Autoencoder | VAE | Kingma et al., 2014
Generative Adversarial Network | GAN | Goodfellow et al., 2014
Density estimation using Real NVP | RealNVP (2d) | Dinh et al., 2017
Denoising Diffusion Probabilistic Models | DDPM | Ho et al., 2020
Density estimation using Real NVP | RealNVP (image) | Dinh et al., 2018
Conditional Variational Autoencoder | CVAE | Sohn et al., 2015
Deep Convolutional Generative Adversarial Networks | DCGAN | Radford et al., 2016
Wasserstein Generative Adversarial Networks | WGAN | Arjovsky et al., 2017
Information Maximizing Generative Adversarial Nets | InfoGAN | Chen et al., 2016

Hybrid Probabilistic Models

Probabilistic Model Name | Abbreviation | Paper Link
Weibull Hybrid Autoencoding Inference | WHAI | Zhang et al., 2018
Weibull Graph Attention Autoencoder | WGAAE | Wang et al., 2020
Recurrent Gamma Belief Network | rGBN | Guo et al., 2020
Multimodal Weibull Variational Autoencoder | MWVAE | Wang et al., 2020
Sawtooth Embedding Topic Model | SawETM | Duan et al., 2021
TopicNet | TopicNet | Duan et al., 2021
Deep Coupling Embedding Topic Model | dc-ETM | Li et al., 2022
Topic Taxonomy Mining with Hyperbolic Embedding | HyperMiner | Xu et al., 2022
Knowledge Graph Embedding Topic Model | KG-ETM | Wang et al., 2022
Variational Edge Partition Model | VEPM | He et al., 2022
Generative Text Convolutional Neural Network | GTCNN | Wang et al., 2022

Deep Probabilistic Models planned to be built

🔥You are welcome to suggest classical or novel Deep Probabilistic Models for us to include.

Probabilistic Model Name | Abbreviation | Paper Link
Nouveau Variational Autoencoder | NVAE | Vahdat et al., 2020
flow-based Variational Autoencoder | f-VAE | Su et al., 2018
Score-Based Generative Models | SGM | Bortoli et al., 2022
Poisson Flow Generative Models | PFGM | Xu et al., 2022
Stable Diffusion | LDM | Rombach et al., 2022
Denoising Diffusion Implicit Models | DDIM | Song et al., 2022
Vector Quantized Diffusion | VQ-Diffusion | Tang et al., 2023
Vector Quantized Variational Autoencoder | VQ-VAE | van den Oord et al., 2017
Conditional Generative Adversarial Nets | cGAN | Mirza et al., 2014
Information Maximizing Variational Autoencoders | InfoVAE | Zhao et al., 2017
Generative Flow | Glow | Kingma et al., 2018
Structured Denoising Diffusion Models in Discrete State-Spaces | D3PM | Austin et al., 2021

Usage

Example: a few lines of code to quickly construct and evaluate a 3-layer Bayesian model (PGBN) on GPU.

from pydpm.model import PGBN
from pydpm.metric import ACC

# create the model and deploy it on gpu or cpu
# (train_data/test_data and the corresponding labels are assumed to be loaded beforehand)
model = PGBN([128, 64, 32], device='gpu')
model.initial(train_data)
train_local_params = model.train(train_data, iter_all=100)
# re-run inference on the training data to collect its final local parameters (Theta)
train_local_params = model.test(train_data, iter_all=100)
test_local_params = model.test(test_data, iter_all=100)

# evaluate the model with classification accuracy
# the demo accuracy can achieve 0.8549
results = ACC(train_local_params.Theta[0], test_local_params.Theta[0], train_label, test_label, 'SVM')

# save the model after training
model.save()

Example: a few lines of code to quickly deploy the distribution sampler of PyDPM on GPU.

import numpy as np

from pydpm.sampler import Basic_Sampler

sampler = Basic_Sampler('gpu')
a = sampler.gamma(np.ones(100)*5, 1, times=10)
b = sampler.gamma(np.ones([100, 100])*5, 1, times=10)

Compare

Comparing the distribution sampling efficiency of PyDPM with NumPy:

Comparing the distribution sampling efficiency of PyDPM with TensorFlow and PyTorch:

Comparing the distribution sampling efficiency of PyDPM with CuPy and PyCUDA (used by PyDPM v1.0):
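The comparison figures themselves are not reproduced here. The sketch below shows one way such a benchmark could be run locally against NumPy; it is only an illustration: timings depend on hardware, the first GPU call may include compilation overhead, and interpreting the `times` argument as the number of repeated draws is an assumption based on the sampler example above.

import time

import numpy as np

from pydpm.sampler import Basic_Sampler

shape = np.ones([1000, 1000]) * 5   # matrix of Gamma shape parameters
scale = 1
repeats = 10

# PyDPM sampler on GPU (times=repeats assumed to mean repeated draws)
sampler = Basic_Sampler('gpu')
start = time.time()
_ = sampler.gamma(shape, scale, times=repeats)
print('pydpm gamma: %.4f s' % (time.time() - start))

# NumPy baseline on CPU, drawing the same number of samples
start = time.time()
for _ in range(repeats):
    _ = np.random.gamma(shape, scale)
print('numpy gamma: %.4f s' % (time.time() - start))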

Contact

License: Apache License Version 2.0

Contact: Chaojie Wang [email protected], Wei Zhao [email protected], Xinyang Liu [email protected], Bufeng Ge [email protected], Jiawen Wu [email protected]

Copyright (c), 2020, Chaojie Wang, Wei Zhao, Xinyang Liu, Jiawen Wu, Jie Ren, Yewen Li, Hao Zhang, Bo Chen and Mingyuan Zhou

pydpm's People

Contributors

bochengroup, chaojiewang94, dustone-mu, xd-wjw, xinyangatk


pydpm's Issues

sampler_kernel_win.cu FileNotFoundError: Could not find module site-packages '...\pydpm\_sampler\_compact\sampler_kernel.dll' (or one of its dependencies). Try using the full path with constructor syntax.

How to fix the error below?
ptxas fatal : Unresolved extern function '_Z3powfi'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Anaconda3\envs\mmdet\lib\site-packages\pydpm\_model\_pgbn.py", line 52, in __init__
    self._sampler = Basic_Sampler(self._model_setting.device)
  File "D:\Anaconda3\envs\mmdet\lib\site-packages\pydpm\_sampler\_basic_sampler.py", line 37, in __init__
    self._gpu_sampler_initial()
  File "D:\Anaconda3\envs\mmdet\lib\site-packages\pydpm\_sampler\_basic_sampler.py", line 63, in _gpu_sampler_initial
    sampler = distribution_sampler_gpu(self.system_type)
  File "D:\Anaconda3\envs\mmdet\lib\site-packages\pydpm\_sampler\_distribution_sampler_gpu.py", line 56, in __init__
    dll = ctypes.cdll.LoadLibrary(compact_path)
  File "D:\Anaconda3\envs\mmdet\lib\ctypes\__init__.py", line 451, in LoadLibrary
    return self._dlltype(name)
  File "D:\Anaconda3\envs\mmdet\lib\ctypes\__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'D:\Anaconda3\envs\mmdet\lib\site-packages\pydpm\_sampler\_compact\sampler_kernel.dll' (or one of its dependencies). Try using the full path with constructor syntax.

when running
from pydpm._model import PGBN
model = PGBN([128,64,32], device='gpu')
Thanks.

Definition of metrics

Would you please provide the definitions of the metrics used for evaluation in your repository?

Thanks.

HelloWorld

First!
When will the Windows version be released?

compile sampler library remotely on AWS

Thanks for the previous Q&A about recompilation. I still wonder how to compile the sampler library files manually on my server; I want to deploy this project on AWS.

/bin/sh: 1: nvcc: not found

I got an error when I ran the sampler on my Ubuntu PC:
'''
/bin/sh: 1: nvcc: not found
  File "/PyDPM4.0.1/pydpm/sampler/distribution_sampler_gpu.py", line 99, in __init__
    dll = ctypes.cdll.LoadLibrary(compact_path)
  File "/anaconda3/envs/pydpm/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
    return self._dlltype(name)
  File "anaconda3/envs/pydpm/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /PyDPM4.0.1/pydpm/sampler/_compact/distribution_sampler.so: cannot open shared object file: No such file or directory
'''
I'm new to this; how can I deal with it?
Thank you

Some recent work about topic models

I found some interesting work published at recent top conferences. Could you include these projects in this library for convenience?

The work list:
Alleviating “Posterior Collapse” in Deep Topic Models via Policy Gradient, NeurIPS 2022
HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding, NeurIPS 2022
Knowledge-Aware Bayesian Deep Topic Model, NeurIPS 2022

how could I use the sampler module to build my own PRNGs?

Thanks for your contributions to the CUDA-based sampler module. I want to build PRNGs for some new distributions with this sampler, and I found that there are different files in sampler/_compact. I would like to know where and how to edit them to add a new PRNG.

Requests for adding new diffusion models

It is great to see you start to add more variants of diffusion models. I wonder if you have a plan to include more recent popular diffusion models in your project, such as:

Denoising Diffusion Implicit Models. NIPS 2022
Improved Vector Quantized Diffusion Models. CVPR 2023

These two are widely treated as baselines in today's research.

Thanks

About metrics of GAN

I see that other models have some good metric APIs, but I can't find a suitable metric to evaluate the quality of Generative Adversarial Networks. Would you please add an evaluation metric for this model? I think it would greatly improve the flexibility of the applications.
Thanks.

Failed installation of pycuda

I get errors while installing the pycuda dependency:
ERROR: Could not build wheels for pycuda, which is required to install pyproject.toml-based projects
from pycuda._driver import * # noqa
ImportError: DLL load failed:
Those errors occurred during installation with pip. I have tried installing pydpm on two different devices, but it doesn't work on either :(
The environment of my Windows PC is listed below:
python 3.6
cuda 10.2
win 10

how to recompile this project?

There are some problems with the sampler, and I suspect that these errors are caused by the compiled files. I want to recompile those CUDA files, so which files are needed for recompilation and how can I do that? Thanks :)

Some questions about dataset used in your demo

Hi

Thanks for your efforts on deep generative models. However, I find that the datasets used in your demos are not public datasets. Could you replace the datasets in your code with those in torchvision/torchtext, etc.? That would make it more convenient for us to extend the models in your library.

Thanks

About Gaussian Process

Thanks for your efforts in summarizing Bayesian models.

I am wondering if you would like to include the widely used Gaussian Process in your library, because some recent studies by my colleagues have difficulties in speeding up the sampling efficiency; the previous implementation of the Gaussian Process on CPU is too slow for us.

Thanks

Some questions about the diffusion model

I'm surprised to see that the Diffusion Model has been added to your repository of Bayesian models. But it is a little complicated to understand the code of the Diffusion Model; is there any tutorial to guide me in generating images with the trained model? It would be nice if there were an example.
Thanks

Some additional suggestions about Normalizing Flow model

I'm glad that this repository has collected many generative models, like the diffusion model, variational autoencoder, Generative Adversarial Networks and so on. But I didn't find a flow-based model; would you like to add a model and demo of Normalizing Flows?

Thanks for your effort in building this repository of generative models.

Warning: shape -36.502342 <= 0 in threads idx: 8258 [thread:(2125698480, 0), block:(66, 0)]

DPGDS error
I met this warning when running the DPGDS demo:
Training Stage: epoch 0 takes 4.90 seconds. Likelihood: -0.425
Training Stage: epoch 1 takes 5.11 seconds. Likelihood: -0.367
Training Stage: epoch 2 takes 4.83 seconds. Likelihood: -0.373
Training Stage: epoch 3 takes 4.65 seconds. Likelihood: -0.376
Training Stage: epoch 4 takes 4.54 seconds. Likelihood: -0.378
Training Stage: epoch 5 takes 4.49 seconds. Likelihood: -0.377
Training Stage: epoch 6 takes 4.44 seconds. Likelihood: -0.377
Training Stage: epoch 7 takes 4.41 seconds. Likelihood: -0.377
Training Stage: epoch 8 takes 4.36 seconds. Likelihood: -0.374
Training Stage: epoch 9 takes 4.37 seconds. Likelihood: -0.369
Training Stage: epoch 10 takes 4.36 seconds. Likelihood: -0.367
Training Stage: epoch 11 takes 4.35 seconds. Likelihood: -0.363
Training Stage: epoch 12 takes 4.28 seconds. Likelihood: -0.360
Training Stage: epoch 13 takes 4.26 seconds. Likelihood: -0.357
Training Stage: epoch 14 takes 4.25 seconds. Likelihood: -0.349
Training Stage: epoch 15 takes 4.25 seconds. Likelihood: -0.343
Training Stage: epoch 16 takes 4.24 seconds. Likelihood: -0.334
Training Stage: epoch 17 takes 4.28 seconds. Likelihood: -0.326
Training Stage: epoch 18 takes 4.25 seconds. Likelihood: -0.315
Training Stage: epoch 19 takes 4.23 seconds. Likelihood: -0.306
Training Stage: epoch 20 takes 4.20 seconds. Likelihood: -0.294
Warning: shape -36.502342 <= 0 in threads idx: 8258 [thread:(2125698480, 0), block:(66, 0)]
Warning: shape -0.142884 <= 0 in threads idx: 8260 [thread:(2125698480, 0), block:(68, 0)]
Warning: shape -2.206542 <= 0 in threads idx: 8262 [thread:(2125698480, 0), block:(70, 0)]
Warning: shape -0.727849 <= 0 in threads idx: 8264 [thread:(2125698480, 0), block:(72, 0)]
Warning: shape -0.010437 <= 0 in threads idx: 8265 [thread:(2125698480, 0), block:(73, 0)]
Warning: shape -0.446480 <= 0 in threads idx: 8266 [thread:(2125698480, 0), block:(74, 0)]
Warning: shape -0.286718 <= 0 in threads idx: 8267 [thread:(2125698480, 0), block:(75, 0)]

Can you add the diffusion model?

At present, the diffusion model is popular, especially DDPM, and it also belongs to the family of deep probabilistic generative models.

A small bug

Dear author, in PyDPM/pydpm/model/deep_learning_pm/vae.py, the sample and forward methods of class VAE have some bugs (decoder is wrong, whereas vae_decoder is right), and they have now been checked.
BuFeng Ge

Issues about SawETM and WHAI

An email from Hjelkrem Tan, a PhD student at the University of Oslo:

``
Dear Mr. Chaojie Wang,

I am a PhD student at the University of Oslo, Norway. I would very much like to use your PyDPM library in my research, as I have read several papers from you and your colleagues on topic models. Would you be able to answer some questions I have about the implementations in your GitHub repo?

I cannot find any implementation of SawETM in the PyDPM library. Are you planning to release this with PyDPM?
In pydpm/model/hybrid_pm/whai.py it seems that the implementation of the encoder does not include the stochastic downward part from the WHAI paper. Is this intentional or something you will change later?

I hope that you can clarify this for me. Thank you for your time!

With best regards,

Martine Hjelkrem Tan

PhD student, University of Oslo

Digital Signal Processing and Image Analysis Group
''
