
bayesian-flow-networks's Issues

Errors while running test.py

When I try to run

python test.py seed=1 config_file=./configs/mnist_discrete.yaml load_model=./pretrained-BFNs/mnist_ema.pt n_steps=784 n_repeats=2000

to test the pre-trained model, I got this error message:
ImportError: cannot import name 'get_generator' from 'utils_train'
(I had already run git clone git@github.com:rupspace/pretrained-BFNs successfully.)

I checked utils_train.py and found that there is no get_generator in it. However, the function does appear in the historical commit 834d896:

def get_generator(seed: int):
    g = torch.Generator()
    g.manual_seed(seed)
    return g

After adding this function to utils_train.py, the error message changed to:

UnpicklingError: Weights only load failed. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if
you get the file from a trusted source. WeightsUnpickler error: Unsupported operand 118

I tried changing weights_only from True to False, but that didn't work either.

Then I switched to my own checkpoint at ./checkpoints/BFN/best/ema_model.pt (trained with your code, of course), kept weights_only as True, and the problem was solved.
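
For reference, a minimal sketch of the load behavior I observed (the actual call site in test.py may differ; the paths are the ones from this issue):

import torch

# Fails with the published checkpoint (UnpicklingError under
# weights_only=True; weights_only=False also failed for me):
state = torch.load("./pretrained-BFNs/mnist_ema.pt", weights_only=True)

# Loads cleanly with a freshly trained checkpoint:
state = torch.load("./checkpoints/BFN/best/ema_model.pt", weights_only=True)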

Therefore, there might be some code to fix and models to update. :-)

Train Discrete BFN with Larger Vocabulary?

Hi,
The discrete BFN presented in the paper has demonstrated competitive performance on the text8 dataset. However, the vocabulary size of text8, at a mere 27, is considerably limited for most NLP tasks. I am curious whether you have experimented with training discrete BFN models on datasets with a larger vocabulary. If so, could you share some insights into the model architecture, hyperparameter settings, and the performance achieved?
Thanks!

There may be an error in the calculation of the best validation loss in train.py

Thanks for this excellent work! I really think the code implementation of BFN is beautiful, in both structure and style.
But when I took a close look at the training part, I found a possible error in the calculation of the best validation loss:

best_val_loss = validate(
      cfg=cfg,
      model=model,
      ema_model=ema_model,
      val_dataloader=dataloaders["val"],
      step=step,
      run=run,
      pbar=pbar,
      best_val_loss=best_val_loss,
      checkpoint_root_dir=checkpoint_root_dir,
      accelerator=accelerator,
)

Because validate() always returns the current validation loss, I think it should be changed to something like this:

val_loss = validate(
      cfg=cfg,
      model=model,
      ema_model=ema_model,
      val_dataloader=dataloaders["val"],
      step=step,
      run=run,
      pbar=pbar,
      best_val_loss=best_val_loss,
      checkpoint_root_dir=checkpoint_root_dir,
      accelerator=accelerator,
)
best_val_loss = min(val_loss, best_val_loss)
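
Alternatively, the min could be taken inside validate() itself; a sketch, assuming validate() computes a scalar val_loss internally and already receives the running best_val_loss:

# Hypothetical tail of validate(); val_loss is the scalar loss computed
# over val_dataloader inside the function.
if val_loss < best_val_loss:
    best_val_loss = val_loss
    # save the best checkpoint under checkpoint_root_dir here
return best_val_loss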

Am I right? Waiting for your reply, thanks!

Dataloader workers on different GPUs may get the same randomness in multi-process training

Hi, it's me again! I think there may be a problem with how the dataloader reseeds workers in multi-GPU training: workers with the same worker_id on different GPUs will get the same randomness if we seed them the way the repo currently does:

import random

import numpy as np
import torch

def worker_init_function(worker_id: int) -> None:
    """https://pytorch.org/docs/stable/notes/randomness.html#dataloader"""
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

def get_generator(seed: int):
    g = torch.Generator()
    g.manual_seed(seed)
    return g
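
To make the collision concrete, a tiny sketch (not the repo's code): every rank builds its DataLoader from the same base seed, and PyTorch derives each worker's seed from that shared base seed plus worker_id, so matching worker_ids collide across ranks.

g0 = get_generator(42)  # imagine this on rank 0
g1 = get_generator(42)  # and this on rank 1
print(g0.initial_seed() == g1.initial_seed())  # True -> identical worker seeds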

One way to avoid this problem is to seed the generator with both the specified seed and the process rank, which may look like this:

def get_generator(seed: int):
    import torch.distributed as dist

    # Offset the seed by the process rank so each GPU gets a distinct
    # generator (fall back to rank 0 when not running distributed).
    rank = dist.get_rank() if dist.is_initialized() else 0
    seed += rank

    g = torch.Generator()
    g.manual_seed(seed)

    return g

With this approach, we don't even have to set worker_init_fn in the DataLoader: each GPU's DataLoader gets a different _base_seed, so every worker on every GPU ends up with its own unique randomness.
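
For completeness, a minimal sketch of wiring the rank-aware generator into a DataLoader (train_dataset and cfg.seed are placeholders for whatever the repo actually uses):

from torch.utils.data import DataLoader

train_dataloader = DataLoader(
    train_dataset,            # placeholder dataset
    batch_size=64,
    shuffle=True,
    num_workers=4,
    generator=get_generator(seed=cfg.seed),  # rank-aware generator from above
)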
