ashutosh1919 / data2vec-pytorch

Ready to run PyTorch implementation of Data2Vec 2.0: Highly efficient self-supervised representation learning for vision, speech and text.

Home Page: https://blog.paperspace.com/data2vec/

License: MIT License

Python 93.84% Shell 5.26% Jupyter Notebook 0.90%
audio-machine-learning computer-vision data2vec deep-learning embedding-models multimodal-deep-learning nlp pytorch self-supervised-learning

data2vec-pytorch's People

Contributors

ashutosh1919

data2vec-pytorch's Issues

PermissionError: [Errno 13] Permission denied: 'data/train.raw'

Although I have granted file permissions, the error still occurs when I run the script. The error location is marked in bold. Can you help me solve it, please?

(pytorch1.10.0) E:\Code\notebooks>bash scripts/train_data2vec_multi_text.sh
Traceback (most recent call last):
File "E:\anaconda3\envs\pytorch1.10.0\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "E:\anaconda3\envs\pytorch1.10.0\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "E:\Code\notebooks\datasets\openwebtext\multiprocessing_bpe_encoder.py", line 150, in <module>
main()
File "E:\Code\notebooks\datasets\openwebtext\multiprocessing_bpe_encoder.py", line 69, in main
inputs = [
File "E:\Code\notebooks\datasets\openwebtext\multiprocessing_bpe_encoder.py", line 70, in <listcomp>
stack.enter_context(open(input, "r", encoding="utf-8"))
PermissionError: [Errno 13] Permission denied: 'data/train.raw'

Traceback (most recent call last):
File "E:\anaconda3\envs\pytorch1.10.0\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "E:\anaconda3\envs\pytorch1.10.0\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "E:\Code\notebooks\datasets\openwebtext\multiprocessing_bpe_encoder.py", line 150, in <module>
main()
File "E:\Code\notebooks\datasets\openwebtext\multiprocessing_bpe_encoder.py", line 69, in main
inputs = [
File "E:\Code\notebooks\datasets\openwebtext\multiprocessing_bpe_encoder.py", line 70, in <listcomp>
stack.enter_context(open(input, "r", encoding="utf-8"))
PermissionError: [Errno 13] Permission denied: 'data/valid.raw'
2024-03-11 11:01:49 | INFO | fairseq_cli.preprocess | Namespace(aim_repo=None, aim_run_hash=None, align_suffix=None, alignfile=None, all_gather_list_size=16384, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, azureml_logging=False, bf16=False, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='final_data/', dict_only=False, empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_file=None, log_format=None, log_interval=100, lr_scheduler='fixed', memory_efficient_bf16=False, memory_efficient_fp16=False, min_loss_scale=0.0001, model_parallel_size=1, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, on_cpu_convert_precision=False, only_source=True, optimizer=None, padding_factor=8, plasma_path='/tmp/plasma', profile=False, quantization_config_path=None, reset_logging=False, scoring='bleu', seed=1, source_lang=None, srcdict=None, suppress_crashes=False, target_lang=None, task='translation', tensorboard_logdir=None, testpref=None, tgtdict=None, threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, tpu=False, trainpref='data/train.bpe', use_plasma_view=False, user_dir=None, validpref='data/valid.bpe', wandb_project=None, workers=100)
Traceback (most recent call last):
File "E:\anaconda3\envs\pytorch1.10.0\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "E:\anaconda3\envs\pytorch1.10.0\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "E:\anaconda3\envs\pytorch1.10.0\Scripts\fairseq-preprocess.exe\__main__.py", line 7, in <module>
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\preprocess.py", line 389, in cli_main
main(args)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\preprocess.py", line 340, in main
src_dict = _build_dictionary(
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\preprocess.py", line 87, in _build_dictionary
return task.build_dictionary(
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\tasks\fairseq_task.py", line 121, in build_dictionary
Dictionary.add_file_to_dictionary(
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\data\dictionary.py", line 354, in add_file_to_dictionary
offsets = find_offsets(local_file, num_workers)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\file_chunker_utils.py", line 25, in find_offsets
with open(filename, "r", encoding="utf-8") as f:
PermissionError: [Errno 13] Permission denied: 'data/train.bpe'
Traceback (most recent call last):
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\hydra_train.py", line 27, in hydra_main
_hydra_main(cfg)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\hydra_train.py", line 31, in _hydra_main
add_defaults(cfg)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\dataclass\initialize.py", line 61, in add_defaults
cfg[k] = merge_with_parent(dc, field_cfg)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\dataclass\utils.py", line 500, in merge_with_parent
merged_cfg = OmegaConf.merge(dc, cfg)
omegaconf.errors.ConfigKeyError: Key 'include_index' not in 'MaskedLMConfig'
full_key: include_index
reference_type=Optional[MaskedLMConfig]
object_type=MaskedLMConfig

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
scripts/train_data2vec_multi_text.sh: line 11: distributed_training.distributed_world_size=1: command not found
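
One thing that may be worth ruling out before changing permissions again: on Windows, calling open() on a path that is actually a directory also raises PermissionError [Errno 13], so the message alone does not tell you whether 'data/train.raw' is an unreadable file or not a file at all. The log above does not confirm which case applies here. The helper below is only a diagnostic sketch; the name `diagnose_input` and the returned labels are made up for illustration and are not part of the repository.

```python
import os


def diagnose_input(path):
    """Classify a path that a script is about to open for reading.

    Hypothetical helper for debugging, not part of data2vec-pytorch or fairseq.
    """
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        # On Windows, open() on a directory fails with PermissionError [Errno 13].
        return "directory"
    if not os.access(path, os.R_OK):
        return "not readable"
    return "readable file"


if __name__ == "__main__":
    # Paths taken from the error messages in the log.
    for candidate in ("data/train.raw", "data/valid.raw", "data/train.bpe"):
        print(candidate, "->", diagnose_input(candidate))
```

Run it from the same working directory as the training script (E:\Code\notebooks in the log). If a path is reported as "directory" or "missing", changing file permissions will not help and the data preparation step is the place to look.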

RecursionError: maximum recursion depth exceeded [Previous line repeated 962 more times]

The following is the terminal error message:

Traceback (most recent call last):
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\hydra_train.py", line 27, in hydra_main
_hydra_main(cfg)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\hydra_train.py", line 56, in _hydra_main
distributed_utils.call_main(cfg, pre_main, **kwargs)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\distributed\utils.py", line 369, in call_main
main(cfg, **kwargs)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq_cli\train.py", line 96, in main
model = task.build_model(cfg.model)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\tasks\fairseq_task.py", line 343, in build_model
model = models.build_model(cfg, self, from_checkpoint)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\models\__init__.py", line 106, in build_model
return model.build_model(cfg, task)
File "E:\Code\notebooks\data2vec\models\data2vec2.py", line 392, in build_model
return cls(cfg, modalities, task=task, skip_ema=cfg.skip_ema)
File "E:\Code\notebooks\data2vec\models\data2vec2.py", line 251, in __init__
self.ema = self.make_ema_teacher(cfg.ema_decay)
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "E:\Code\notebooks\data2vec\models\data2vec2.py", line 302, in make_ema_teacher
return EMAModule(
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\modules\ema_module.py", line 42, in __init__
self.model = copy.deepcopy(model)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 146, in deepcopy
y = copier(x, memo)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "E:\anaconda3\envs\pytorch1.10.0\lib\copy.py", line 271, in _reconstruct
if hasattr(y, '__setstate__'):
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\tasks\fairseq_task.py", line 40, in __getattr__
if name not in self._state and name in self._factories:
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\tasks\fairseq_task.py", line 40, in __getattr__
if name not in self._state and name in self._factories:
File "E:\anaconda3\envs\pytorch1.10.0\lib\site-packages\fairseq\tasks\fairseq_task.py", line 40, in __getattr__
if name not in self._state and name in self._factories:
[Previous line repeated 962 more times]
RecursionError: maximum recursion depth exceeded
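
The last frames of the traceback match a common Python pitfall. copy.deepcopy rebuilds an object without calling its __init__, then probes it with hasattr(y, '__setstate__'). If the class defines a __getattr__ that reads self._state, and _state has not been restored yet, each lookup of _state re-enters __getattr__ until the recursion limit is hit. The sketch below reproduces that mechanism with hypothetical stand-in classes (`LazyContainer`, `GuardedContainer`); they only mimic the line shown in the traceback and are not the fairseq classes, and the guard is one possible way to break the cycle, not a confirmed fix for this repository.

```python
import copy


class LazyContainer:
    """Hypothetical stand-in that mimics the lookup pattern in the traceback."""

    def __init__(self):
        self._state = {}
        self._factories = {}

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails. On an instance that
        # deepcopy created without running __init__, self._state is missing
        # too, so evaluating it calls __getattr__ again, without end.
        if name not in self._state and name in self._factories:
            self._state[name] = self._factories[name]()
        if name in self._state:
            return self._state[name]
        raise AttributeError(name)


class GuardedContainer(LazyContainer):
    def __getattr__(self, name):
        # Refuse private and dunder names up front, so probes such as
        # hasattr(obj, "__setstate__") fail fast with AttributeError.
        if name.startswith("_"):
            raise AttributeError(name)
        return super().__getattr__(name)


def deepcopy_outcome(obj):
    """Return "ok" if deepcopy succeeds, or "RecursionError" if it recurses."""
    try:
        copy.deepcopy(obj)
        return "ok"
    except RecursionError:
        return "RecursionError"


if __name__ == "__main__":
    print(deepcopy_outcome(LazyContainer()))
    print(deepcopy_outcome(GuardedContainer()))
```

In the log, the object being deep-copied is the model passed to EMAModule, and the recursion starts inside fairseq_task.py, which suggests the model holds a reference to the task. Whether the right remedy is to guard the lookup, drop that reference before copying, or align the installed fairseq version with the one the repository expects is not something the traceback settles.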