Comments (10)
Thanks for your interest in the library. Here are the steps to use your folder-based dataset:
- Use torchvision.datasets.ImageFolder. This assumes images are organized into folders, where each folder is a class, as in the layout sketched below.
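Something like the following, where the class and file names are just placeholders:

```
/path/to/your/dataset/
    class_a/
        img_001.jpg
        img_002.jpg
    class_b/
        img_003.jpg
        img_004.jpg
```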
- Write a wrapper class around the ImageFolder object:
```python
from torch.utils.data import Dataset
from torchvision import datasets
import numpy as np


class YourDataset(Dataset):
    def __init__(self, root, transform=None, download=False):
        self.dataset = datasets.ImageFolder(root, transform=transform)
        # These look useless, but they are required by powerful-benchmarker.
        self.labels = np.array([b for (a, b) in self.dataset.imgs])
        self.transform = transform

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        return self.dataset[idx]
```
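Before wiring this into the benchmarker, a quick sanity check can confirm the wrapper loads your images (the path here is a placeholder for your own dataset):

```python
dataset = YourDataset("/path/to/your/dataset")
print(len(dataset))         # number of images found by ImageFolder
print(dataset.labels[:10])  # integer class labels used by powerful-benchmarker
img, label = dataset[0]     # ImageFolder items are (image, label) pairs
```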
- Register the dataset at the bottom of run.py by replacing the last 2 lines with:

```python
r = runner(**(args.__dict__))
from your_custom_modules import your_dataset
r.register("dataset", your_dataset)
r.run()
```
In the above, `your_custom_modules` should be a folder that contains an empty `__init__.py` file, as well as `your_dataset.py`, which contains your custom dataset class (see the sketch below). You can have multiple dataset classes in there, and they will all get registered. See the documentation for details.
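Concretely, the module folder would look like this:

```
your_custom_modules/
    __init__.py      # empty; makes the folder a Python package
    your_dataset.py  # contains the YourDataset class above
```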
- Add the following flags to the run command:

```bash
python run.py \
<all your other flags> \
--dataset~OVERRIDE~ {YourDataset: {root: /path/to/your/dataset}} \
--split_manager~APPLY~2 {data_and_label_getter_keys: null}
```
You can also make these changes directly in the config files. To do that, download the files to some location (it's probably easiest to just download this whole repo and then move the configs folder). Then set the `--root_config_folder` flag to that location, either at the command line or in run.py. These will now be considered the default config files, so you can change them however you like to minimize the number of command-line flags. For example, you can make YourDataset the default instead of CUB200 by changing `default.yaml` in the `config_dataset` folder, roughly as sketched below.
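For illustration, the edit to `default.yaml` might look something like this. The key structure is an assumption inferred from the `--dataset~OVERRIDE~` syntax above, not a verified excerpt, so compare against the shipped file before editing:

```yaml
# Hypothetical sketch of config_dataset/default.yaml;
# verify the actual key names against the shipped file.
dataset:
  YourDataset:
    root: /path/to/your/dataset
```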
Let me know if that helps!
This answered my question. Thank you very much!! :)
Hello, I was able to successfully train/validate according to your detailed instructions. Unfortunately, I ran into an error during testing using the `--splits_to_eval [test]` method described in the documentation. Any help or input would be valuable. Thank you!
```
INFO:root:Getting split: Test50_50_Partitions4_1 / train / length 11900 / using train transform
INFO:root:Creating end_of_epoch_hook
INFO:root:Getting split: Test50_50_Partitions4_1 / test / length 16100 / using eval transform
Traceback (most recent call last):
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 27, in run_train_or_eval
    self.run_for_each_split_scheme()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 50, in run_for_each_split_scheme
    self.eval()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 66, in eval
    self.setup_eval_and_run(load_best_model=True, use_input_embedder=False)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 70, in setup_eval_and_run
    eval_dict = self.get_eval_dict(load_best_model, untrained, untrained, use_input_embedder=use_input_embedder)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 148, in get_eval_dict
    best_epoch, _ = pml_cf.latest_version(self.model_folder, best=True)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/pytorch_metric_learning/utils/common_functions.py", line 327, in latest_version
    resume_epoch = max(version)
ValueError: max() arg is an empty sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run.py", line 46, in <module>
    r.run()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/runners/single_experiment_runner.py", line 18, in run
    return self.run_new_experiment_or_resume(self.YR)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/runners/single_experiment_runner.py", line 33, in run_new_experiment_or_resume
    return self.start_experiment(args)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/runners/single_experiment_runner.py", line 22, in start_experiment
    run_output = api_parser.run()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 17, in run
    return self.run_train_or_eval()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 36, in run_train_or_eval
    raise ValueError
ValueError
```
Can you go to `<experiment_folder>/Test50_50_Partitions4_1/saved_models` and see if there are any models saved there? If yes, can you paste their names here or take a screenshot? If no, can you check one of the other folders, like `<experiment_folder>/Test50_50_Partitions4_0/saved_models`?
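For instance, from a shell (substituting your actual experiment folder for the placeholder):

```bash
ls <experiment_folder>/Test50_50_Partitions4_1/saved_models
```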
As you noted, I think the problem is that my 4_1 partition does not contain the "best" model files for some reason. I checked 4_0, 4_2, and 4_3, and they all contain the required files.
Hmm, that's strange. Can you open `Test50_50_Partitions4_1/saved_csvs/accuracies_normalized_compared_to_self_GlobalEmbeddingSpaceTester_level_0_VAL.csv` and paste the contents here?
Sure, this is what I got for partition 4_1.
It's strange, because I just ran train/val/test with triplet loss using the same dataset, and it worked fine.
Ok, I found the issue. The bug occurs when accuracy never improves over the initial model. (Epoch -1 is the initial trunk model, and epoch 0 is the initial trunk + embedder.) I've made a separate issue to address this: #49.
In the meantime, just try to find hyperparameters that actually improve accuracy over the 0th model, and testing should work. Alternatively, you can set `check_untrained_accuracy` to `False`, and that'll also work. But improving over the 0th model is probably what you want to do, and keeping the flag set to `True` allows you to check that.
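To illustrate why the traceback ends in "max() arg is an empty sequence": a "best" checkpoint is only written when validation accuracy improves, so if it never improves, the eval step has no best epoch to load. This is a minimal conceptual sketch of that failure mode, not the library's actual code, and the checkpoint file-name pattern is hypothetical:

```python
import glob
import os
import re

def latest_best_epoch(model_folder):
    # Conceptual sketch of latest_version(..., best=True):
    # collect the epoch numbers of all saved "best" checkpoints.
    epochs = []
    for path in glob.glob(os.path.join(model_folder, "*best*.pth")):
        match = re.search(r"best(\d+)", os.path.basename(path))
        if match:
            epochs.append(int(match.group(1)))
    # If training never beat the initial model, no "best" checkpoint
    # was written, epochs is empty, and max() raises
    # "ValueError: max() arg is an empty sequence".
    return max(epochs)
```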
Yes I will try to improve accuracy using different hyperparameters. Thanks again for everything!