jiahao000 / mfm
[ICLR 2023] Masked Frequency Modeling for Self-Supervised Visual Pre-Training
License: Other
Thank you very much for MFM!
When I run `bash dist_finetune.sh ...`, I get the error below.
How can I run `bash dist_finetune.sh ...` on a single GPU instead of multiple GPUs?
/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects `--local-rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
': [Errno 2] No such file or directoryhon: can't open file '/mnt/d/Software/AI/mfm/2304/mfm_1/
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 80) of binary: /opt/conda/envs/py3.9_cuda11.8/bin/python
Traceback (most recent call last):
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/launch.py", line 196, in <module>
main()
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/launch.py", line 192, in main
launch(args)
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/launch.py", line 177, in launch
run(args)
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/envs/py3.9_cuda11.8/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 249, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-05-09_06:58:55
host : LZH2.localdomain
rank : 0 (local_rank: 0)
exitcode : 2 (pid: 80)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
dist_finetune.sh: line 10: main_finetune.py: command not found
dist_finetune.sh: line 11: --cfg: command not found
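One possible reading of the log above: the messages `main_finetune.py: command not found` and `--cfg: command not found`, together with the scrambled `No such file or directory` line, are what a shell script with Windows (CRLF) line endings typically produces, because a trailing carriage return breaks the `\` line continuations. That is an assumption here (the `/mnt/d/...` path suggests a Windows drive mounted under WSL), not something confirmed in this thread. A minimal sketch for checking and converting the script, where `normalize_line_endings` is a hypothetical helper:

```python
from pathlib import Path


def normalize_line_endings(path):
    """Rewrite a file in place with LF line endings.

    Returns the number of CRLF pairs that were replaced (0 means the
    file already used LF endings and was left untouched).
    """
    p = Path(path)
    data = p.read_bytes()
    count = data.count(b"\r\n")
    if count:
        p.write_bytes(data.replace(b"\r\n", b"\n"))
    return count


# Example usage (assumes the script is in the current directory):
# replaced = normalize_line_endings("dist_finetune.sh")
# print(f"replaced {replaced} CRLF line endings")
```

On the single-GPU question: the PyTorch launcher accepts `--nproc_per_node=1`, which starts a single worker process. How `dist_finetune.sh` forwards its GPU-count argument is not shown in this thread, so check the script before relying on that.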
Hi, authors.
Thank you for the great work; I really enjoyed reading it.
I have a question about the loss function in the code.
According to the paper, MFM computes the loss on the real and imaginary parts of the frequency representations of the prediction and the target.
In your code (lines 64 and 65 of frequency_loss.py), you compute
tmp = (recon_freq - real_freq) ** 2
loss = torch.sqrt(tmp[..., 0] + tmp[..., 1] + 1e-12) ** self.loss_gamma
where, as I understand it, recon_freq and real_freq are complex tensors of shape (N x P x C x H x W).
What does "tmp[..., 0] + tmp[..., 1]" mean in that case?
Are you computing the loss only over the first and second elements of the W dimension?
Please correct me if I have misunderstood the code.
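One reading that would make the indexing meaningful: if the frequency tensors are not complex-typed but real-valued with an extra trailing dimension of size 2 holding (real, imaginary), which is the layout `torch.view_as_real` produces, then `tmp[..., 0] + tmp[..., 1]` is the squared real difference plus the squared imaginary difference, i.e. the squared modulus of the complex difference. Whether frequency_loss.py builds its tensors this way is an assumption here. A small NumPy sketch of that layout, with hypothetical sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small sizes standing in for (N, P, C, H, W).
shape = (2, 1, 3, 4, 4)
recon = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
target = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


def as_real_pairs(z):
    """Stack real and imaginary parts on a new trailing axis of size 2."""
    return np.stack([z.real, z.imag], axis=-1)


recon_freq = as_real_pairs(recon)    # shape (2, 1, 3, 4, 4, 2)
real_freq = as_real_pairs(target)    # shape (2, 1, 3, 4, 4, 2)

tmp = (recon_freq - real_freq) ** 2
# Index 0 is the squared real difference, index 1 the squared imaginary one.
distance = np.sqrt(tmp[..., 0] + tmp[..., 1] + 1e-12)

# Up to the small epsilon, this is the modulus of the complex difference,
# evaluated at every position of the original (N, P, C, H, W) grid.
assert distance.shape == shape
assert np.allclose(distance, np.abs(recon - target), atol=1e-6)
```

Under this layout the trailing index selects the real or imaginary component, not positions along W, so every frequency contributes to the loss.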
The paper says: "In practice, we compute the loss only on the masked area of the frequency spectrum instead of the
full spectrum as the latter tends to decrease the accuracy according to our experiments."
However, in the code I found: "loss = (loss_recon * (1 - mask.unsqueeze(1))).sum() / (1 - mask).sum() / self.in_chans / loss_recon.shape[1]"
These two seem contradictory. Could you explain? Thank you.
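Whether the two statements conflict depends on the mask convention. If `mask` is 1 where a frequency is kept and 0 where it is masked out (an assumption, not confirmed in this thread), then `1 - mask` selects exactly the masked area and the code matches the paper. A simplified NumPy sketch of that reading, with hypothetical sizes and without the extra `loss_recon.shape[1]` factor:

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, H, W = 2, 3, 4, 4                  # hypothetical sizes
loss_recon = rng.random((N, C, H, W))    # per-frequency reconstruction loss

# Assumed convention: 1 = frequency kept in the input, 0 = masked out.
mask = np.zeros((N, H, W))
mask[:, :2, :] = 1.0                     # keep the top half, mask the rest

masked_out = 1.0 - mask                  # 1 exactly on the masked area
loss = (loss_recon * masked_out[:, None]).sum() / masked_out.sum() / C

# Same value as a plain mean over the masked-out positions only.
selected = np.broadcast_to(masked_out[:, None] == 1, loss_recon.shape)
assert np.isclose(loss, loss_recon[selected].mean())

# Changing the loss on kept frequencies leaves the result unchanged.
perturbed = loss_recon + 100.0 * mask[:, None]
loss_perturbed = (perturbed * masked_out[:, None]).sum() / masked_out.sum() / C
assert np.isclose(loss, loss_perturbed)
```

If the repository uses the opposite convention (1 = masked out), the same expression would average over the kept area instead, so the mask generator is the place to check.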
` # 2D FFT
x_freq = torch.fft.fft2(image)
# shift low frequency to the center
x_freq = torch.fft.fftshift(x_freq, dim=(-2, -1))
# mask a portion of frequencies
x_freq_masked = x_freq
# restore the original frequency order
x_freq_masked = torch.fft.ifftshift(x_freq_masked, dim=(-2, -1))
# 2D iFFT (only keep the real part)
x_corrupted = torch.fft.ifft2(x_freq_masked).real
x_corrupted = torch.clamp(x_corrupted, min=0., max=1.)
x_np = x_corrupted.numpy()
im = Image.fromarray((x_np * 255).astype(np.uint8))
im.save(os.path.join(output_folder, filename))`
The image I save with the above code looks very different from the cat example. Could you provide a demo?
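One thing visible in the snippet as posted is that the masking step is a no-op (`x_freq_masked = x_freq`), so the saved image should simply be the input again. Below is a minimal NumPy sketch of low-pass corruption with an explicit circular mask. It is an illustration, not the authors' demo: `low_pass_corrupt` and `radius` are hypothetical names, and the image is assumed to be a float array in [0, 1] laid out as (H, W) or (H, W, C). The `np.fft` calls used here have `torch.fft` counterparts with the same names.

```python
import numpy as np


def low_pass_corrupt(image, radius):
    """Keep only frequencies within `radius` of the spectrum center.

    image: float array in [0, 1], shape (H, W) or (H, W, C).
    Returns an array of the same shape, clipped to [0, 1].
    """
    h, w = image.shape[:2]
    # 2D FFT over the spatial axes, low frequencies shifted to the center.
    x_freq = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))

    # Circular mask: True inside the radius (kept), False outside (removed).
    yy, xx = np.ogrid[:h, :w]
    keep = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    if image.ndim == 3:
        keep = keep[..., None]           # broadcast over channels
    x_freq_masked = x_freq * keep

    # Restore the original frequency order and invert (real part only).
    x_freq_masked = np.fft.ifftshift(x_freq_masked, axes=(0, 1))
    x_corrupted = np.fft.ifft2(x_freq_masked, axes=(0, 1)).real
    return np.clip(x_corrupted, 0.0, 1.0)
```

When saving the result with `Image.fromarray`, the array should be `uint8` with shape (H, W) or (H, W, 3); a channel-first tensor would need its axes moved first.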
I'm excited by your work. I have run related experiments in the frequency domain before without getting better results, and your work has given me new ideas. Could you provide a pre-trained model for visualizing the test images in test.py?
I would appreciate it!