victorchall / everydream
Advanced fine tuning tools for vision models
License: GNU Affero General Public License v3.0
When trying to run auto caption, the script fails with:
Windows detected, using asyncio.WindowsSelectorEventLoopPolicy
starting
input_dir: input
Downloading model to .cache/model_base_caption_capfilt_large.pth... please wait
Model cached to: .cache/model_base_caption_capfilt_large.pth
Downloading (…)solve/main/vocab.txt: 100%|██████████████████████████████| 232k/232k [00:00<00:00, 6.17MB/s]
Downloading (…)okenizer_config.json: 100%|██████████████████████████████| 28.0/28.0 [00:00<00:00, 14.0kB/s]
Downloading (…)lve/main/config.json: 100%|█████████████████████████████████| 570/570 [00:00<00:00, 228kB/s]
load checkpoint from .cache/model_base_caption_capfilt_large.pth
loading model to cuda
working image: input\00012-1722407061-gigapixel-standard-height-1024px.jpg
Traceback (most recent call last):
File ".\scripts\auto_caption.py", line 217, in <module>
asyncio.run(main(opt))
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\asyncio\runners.py", line 43, in run
return loop.run_until_complete(main)
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\asyncio\base_events.py", line 608, in run_until_complete
return future.result()
File ".\scripts\auto_caption.py", line 157, in main
captions = blip_decoder.generate(image, sample=sample, num_beams=16, min_length=opt.min_length, \
File "scripts/BLIP\models\blip.py", line 156, in generate
outputs = self.text_decoder.generate(input_ids=input_ids,
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\transformers\generation\utils.py", line 1524, in generate
return self.beam_search(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\transformers\generation\utils.py", line 2810, in beam_search
outputs = self(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "scripts/BLIP\models\med.py", line 886, in forward
outputs = self.bert(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "scripts/BLIP\models\med.py", line 781, in forward
encoder_outputs = self.encoder(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "scripts/BLIP\models\med.py", line 445, in forward
layer_outputs = layer_module(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "scripts/BLIP\models\med.py", line 361, in forward
cross_attention_outputs = self.crossattention(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "scripts/BLIP\models\med.py", line 277, in forward
self_outputs = self.self(
File "C:\Users\ssuuk\anaconda3\envs\dl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "scripts/BLIP\models\med.py", line 178, in forward
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (16) must match the size of tensor b (256) at non-singleton dimension 0
(dl) PS D:\Projekty\EveryDream>
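The failing `torch.matmul` computes attention scores, and the mismatched batch dimensions (16 vs. 256) suggest the beam-expanded decoder tensors and the encoder hidden states disagree about batch size. This kind of mismatch is often caused by running the vendored BLIP code against a newer transformers release than it was written for, which expands beam-search inputs differently; pinning an older transformers version is a common workaround. As a hedged sketch of the shape bookkeeping involved (the shapes and tensor names below are illustrative, not BLIP's actual ones):

```python
import torch

num_beams = 16
# hypothetical shapes (batch, heads, seq_len, head_dim); not BLIP's real sizes
query = torch.randn(1, 12, 5, 64)   # decoder queries for one image
key = torch.randn(1, 12, 30, 64)    # encoder keys for the same image

# beam search expands the decoder side once per beam
query = query.repeat_interleave(num_beams, dim=0)  # batch becomes 16

# if the encoder states are not expanded the same way, the batch dimensions
# of the two matmul operands disagree and torch raises the "size of tensor a
# must match the size of tensor b" error seen above (unless one side is 1
# and broadcasting applies)
key = key.repeat_interleave(num_beams, dim=0)      # keep them in sync
scores = torch.matmul(query, key.transpose(-1, -2))
```
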
It's not very clear from the documentation whether this is possible yet in this project's code, unless I'm missing something major.
I would like to resize my training images so that their smallest dimension is, for example, 512px, with the larger dimension being whatever it ends up as. Then, during training, a random 512x512 section is cropped from the 512-by-whatever image. The caption for the full image would remain attached to the crop made from it.
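The resize-then-random-crop idea described above can be sketched with Pillow; this is a minimal illustration, and the function names and parameters are my own, not anything from EveryDream:

```python
import random
from PIL import Image

def resize_shortest_side(img: Image.Image, target: int = 512) -> Image.Image:
    # scale so the smallest dimension equals `target`, preserving aspect ratio
    w, h = img.size
    scale = target / min(w, h)
    return img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)

def random_square_crop(img: Image.Image, size: int = 512) -> Image.Image:
    # pick a random size x size window; the full image's caption stays
    # attached to whatever crop this produces
    w, h = img.size
    left = random.randint(0, w - size)
    top = random.randint(0, h - size)
    return img.crop((left, top, left + size, top + size))
```
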
I am running /EveryDream/scripts/auto_caption.py --torch_device cpu
and it throws RuntimeError: The size of tensor a (16) must match the size of tensor b (256) at non-singleton dimension 0
I am not exactly sure what this means or how I could resolve it.
Note, I have no GPU; for this experiment I intend to caption just a single image on my CPU (a 2.6 GHz 6-core Intel Core i7 with 16 GB of 2667 MHz DDR4).
Add support for training Stable Diffusion 2.0-related models.
Models can be found in this repo: https://github.com/Stability-AI/stablediffusion
...and get rid of globals. Should probably split things up into classes.
Do something to throttle downloads per domain; we may hit rate limits on the other end. Maybe shuffle matches, or get fancy and track requests per domain?
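One way to throttle per domain, sketched here as an assumption rather than anything in the script, is a semaphore keyed by hostname; the concurrency limit of 2 below is an arbitrary placeholder:

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlparse

# one semaphore per domain caps concurrent requests to that host;
# the limit of 2 is an arbitrary assumption, not a value from the script
domain_limits = defaultdict(lambda: asyncio.Semaphore(2))

async def fetch(url: str) -> str:
    host = urlparse(url).netloc
    async with domain_limits[host]:
        await asyncio.sleep(0)  # placeholder for the actual download
        return host
```

Shuffling the match list before downloading would spread requests across domains and make the per-domain caps bite less often.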
--split n
so you can do crazy numbers
Ex: --split 1000 will create an out_dir/n subfolder for every 1000 images downloaded.
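The --split behavior described above amounts to integer division of the running image count; a minimal sketch (the helper names are hypothetical, not from the script):

```python
import os

def subfolder_for(index: int, split: int) -> str:
    # images 0..split-1 land in "0", the next `split` images in "1", and so on
    return str(index // split)

def out_path(out_dir: str, index: int, split: int, filename: str) -> str:
    # e.g. with --split 1000, the 2500th image goes under out_dir/2/
    return os.path.join(out_dir, subfolder_for(index, split), filename)
```
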
filename cleaning is sloppy
Code that will limit extreme aspect ratios, e.g. --aspect_max 1.6 would skip 2:1 images, as they may be difficult to crop.
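The aspect-ratio filter above reduces to comparing the longer side over the shorter side against the cap; a hedged sketch, with a function name of my own invention:

```python
def too_extreme(width: int, height: int, aspect_max: float = 1.6) -> bool:
    # ratio of the longer side to the shorter side; skip the image if it
    # exceeds the cap (a 2:1 image has ratio 2.0 > 1.6, so it is skipped)
    return max(width, height) / min(width, height) > aspect_max
```
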
Hi, this is the first time I've seen something with .parquet.
I cloned https://huggingface.co/datasets/laion/laion2B-en-aesthetic and manually copied 127 parquet files into the laion folder, then ran download_laion.py and got this result. I need help.
(venv) root@n8f6ytisqn:/notebooks/EveryDream# python scripts/download_laion.py --search_text "a man" --limit 50
Launching...
is running in venv: True
{Fore.CYAN}Unix detected, using default asyncio event loop policy{Style.RESET_ALL}
Searching for a man in column: TEXT in ./laion//*.parquet
reading file: ./laion/part-00051-9230b837-b1e0-4254-8b88-ed2976e9cee9-c000.snappy.parquet
Traceback (most recent call last):
File "/notebooks/EveryDream/scripts/download_laion.py", line 322, in <module>
result = asyncio.run(download_laion_matches(opt))
File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
return future.result()
File "/notebooks/EveryDream/scripts/download_laion.py", line 274, in download_laion_matches
df = pd.read_parquet(file, engine="auto")
File "/notebooks/EveryDream/venv/lib/python3.9/site-packages/pandas/io/parquet.py", line 503, in read_parquet
return impl.read(
File "/notebooks/EveryDream/venv/lib/python3.9/site-packages/pandas/io/parquet.py", line 251, in read
result = self.api.parquet.read_table(
File "/notebooks/EveryDream/venv/lib/python3.9/site-packages/pyarrow/parquet/__init__.py", line 2780, in read_table
dataset = _ParquetDatasetV2(
File "/notebooks/EveryDream/venv/lib/python3.9/site-packages/pyarrow/parquet/__init__.py", line 2368, in __init__
[fragment], schema=schema or fragment.physical_schema,
File "pyarrow/_dataset.pyx", line 898, in pyarrow._dataset.Fragment.physical_schema.__get__
File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Could not open Parquet input source '<Buffer>': Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.
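"Parquet magic bytes not found in footer" usually means the files on disk are not actual parquet data. One common cause when cloning a Hugging Face dataset is a missing git-lfs install: the clone then contains small text pointer files in place of the real data. A quick way to check, sketched with a function name of my own:

```python
def looks_like_parquet(path: str) -> bool:
    # a real parquet file begins and ends with the 4-byte magic "PAR1";
    # a git-lfs pointer file is plain text and fails this check
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, 2)  # jump to the last 4 bytes
        tail = f.read(4)
    return head == b"PAR1" and tail == b"PAR1"
```

If the check fails, re-downloading the files through git-lfs (or directly from the dataset's web page) should fix the error.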
code should probably be using coroutines