Comments (10)

KevinMusgrave commented on September 14, 2024

Thanks for your interest in the library. Here are the steps to use your folder-based dataset:

  1. Use torchvision.datasets.ImageFolder. This assumes images are organized into folders, where each folder is a class.

  2. Write a wrapper class around the ImageFolder object:

from torch.utils.data import Dataset
from torchvision import datasets
import numpy as np

class YourDataset(Dataset):
    # download is accepted but unused in this wrapper
    def __init__(self, root, transform=None, download=False):
        # ImageFolder treats each sub-folder of root as one class
        self.dataset = datasets.ImageFolder(root, transform=transform)

        # these look useless, but are required by powerful-benchmarker
        self.labels = np.array([b for (a, b) in self.dataset.imgs])
        self.transform = transform

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        # returns an (image, label) tuple from the wrapped ImageFolder
        return self.dataset[idx]
  3. Register the dataset at the bottom of run.py by replacing the last 2 lines with:
r = runner(**(args.__dict__))
from your_custom_modules import your_dataset
r.register("dataset", your_dataset)
r.run()

In the above, your_custom_modules should be a folder that contains an empty __init__.py file, as well as your_dataset.py, which contains your custom dataset class. You can have multiple dataset classes in there, and they will all get registered. See the documentation for details.

  4. Add the following flags to the run command:
python run.py \
<all your other flags> \
--dataset~OVERRIDE~ {YourDataset: {root: /path/to/your/dataset}} \
--split_manager~APPLY~2 {data_and_label_getter_keys: null}

You can also make these changes directly in the config files. To do that, download the files to some location. (It's probably easiest to just download this whole repo, and then move the configs folder). Then set the --root_config_folder flag to that location, either at the command line, or in run.py. These will now be considered the default config files, so you can change them however you like to minimize the number of command line flags. For example, you can make YourDataset the default instead of CUB200, by changing default.yaml in the config_dataset folder.
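For reference, here is a minimal standard-library sketch of the folder-per-class convention from step 1: each sub-folder of the root is a class, and class indices follow the sorted folder names. It only mimics the layout rule and is not torchvision's implementation; the helper names `folder_labels` and `make_example_tree` and the example folder names are made up for illustration.

```python
import os
import tempfile


def folder_labels(root):
    """Return (class_names, labels) for a folder-per-class tree.

    Hypothetical helper: classes are the sorted sub-folder names, and each
    file gets the index of the class folder it lives in. Unlike torchvision's
    ImageFolder, every file is counted, whatever its extension.
    """
    class_names = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    labels = []
    for index, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for _dirpath, _dirnames, filenames in os.walk(class_dir):
            labels.extend(index for _ in filenames)
    return class_names, labels


def make_example_tree(root):
    """Create a tiny example tree: 2 empty files in 'cat', 1 in 'dog'."""
    for class_name, count in (("dog", 1), ("cat", 2)):
        folder = os.path.join(root, class_name)
        os.makedirs(folder)
        for i in range(count):
            open(os.path.join(folder, f"{i}.jpg"), "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_example_tree(root)
        print(folder_labels(root))
```

The labels list built this way plays the same role as the `self.labels` array in the wrapper class: one integer class index per sample.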

Let me know if that helps!

from powerful-benchmarker.

gustmd0121 commented on September 14, 2024

This answered my question. Thank you very much!! :)

gustmd0121 commented on September 14, 2024

Hello, I was able to successfully train/validate according to your detailed instructions.
Unfortunately, I ran into an error during testing with the --splits_to_eval [test] method described in the documentation.
Any help or input will be valuable. Thank you!

INFO:root:Getting split: Test50_50_Partitions4_1 / train / length 11900 / using train transform
INFO:root:Creating end_of_epoch_hook
INFO:root:Getting split: Test50_50_Partitions4_1 / test / length 16100 / using eval transform
Traceback (most recent call last):
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 27, in run_train_or_eval
    self.run_for_each_split_scheme()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 50, in run_for_each_split_scheme
    self.eval()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 66, in eval
    self.setup_eval_and_run(load_best_model=True, use_input_embedder=False)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 70, in setup_eval_and_run
    eval_dict = self.get_eval_dict(load_best_model, untrained, untrained, use_input_embedder=use_input_embedder)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 148, in get_eval_dict
    best_epoch, _ = pml_cf.latest_version(self.model_folder, best=True)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/pytorch_metric_learning/utils/common_functions.py", line 327, in latest_version
    resume_epoch = max(version)
ValueError: max() arg is an empty sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run.py", line 46, in <module>
    r.run()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/runners/single_experiment_runner.py", line 18, in run
    return self.run_new_experiment_or_resume(self.YR)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/runners/single_experiment_runner.py", line 33, in run_new_experiment_or_resume
    return self.start_experiment(args)
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/runners/single_experiment_runner.py", line 22, in start_experiment
    run_output = api_parser.run()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 17, in run
    return self.run_train_or_eval()
  File "/home/hyunseung/anaconda3/envs/dota/lib/python3.7/site-packages/powerful_benchmarker/api_parsers/base_api_parser.py", line 36, in run_train_or_eval
    raise ValueError
ValueError

KevinMusgrave commented on September 14, 2024

Can you go to <experiment_folder>/Test50_50_Partitions4_1/saved_models and see if there are any models saved there?

If yes, can you paste their names here or take a screenshot?

If no, can you check one of the other folders like <experiment_folder>/Test50_50_Partitions4_0/saved_models?
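The same check can be scripted across all split folders at once. The following is only a sketch: the helper names are hypothetical, and it assumes that best-model checkpoints carry "best" somewhere in their file name.

```python
from pathlib import Path


def saved_model_report(experiment_folder):
    """Map each split folder name to the sorted file names in its saved_models.

    Split folders without a saved_models sub-folder map to an empty list.
    """
    report = {}
    for split_dir in sorted(Path(experiment_folder).iterdir()):
        if not split_dir.is_dir():
            continue
        models_dir = split_dir / "saved_models"
        if models_dir.is_dir():
            report[split_dir.name] = sorted(p.name for p in models_dir.iterdir())
        else:
            report[split_dir.name] = []
    return report


def splits_missing_best(report, marker="best"):
    """Return split names whose file list has no name containing `marker`.

    Assumption: best-model checkpoints have "best" in their file name.
    """
    return [
        name
        for name, files in report.items()
        if not any(marker in file_name for file_name in files)
    ]
```

Calling `splits_missing_best(saved_model_report("<experiment_folder>"))` with the real experiment path would list the partitions that have no best model saved.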

gustmd0121 commented on September 14, 2024

screenshot

As you noted, I think the problem is that my 4_1 partition does not contain best files for some reason. I checked 4_0, 4_2, 4_3 and they all contain the required files.

KevinMusgrave commented on September 14, 2024

Hmm that's strange. Can you open Test50_50_Partitions4_1/saved_csvs/accuracies_normalized_compared_to_self_GlobalEmbeddingSpaceTester_level_0_VAL.csv and paste the contents here?

gustmd0121 commented on September 14, 2024

csvfile

Sure, this is what I got for partition 4_1.

gustmd0121 commented on September 14, 2024

It's strange because I just train/val/tested with triplet loss using the same dataset and it worked fine.

KevinMusgrave commented on September 14, 2024

Ok, I found the issue. The bug occurs when accuracy never improves over the initial model. (Epoch -1 is the initial trunk model, and epoch 0 is the initial trunk + embedder.) I've opened a separate issue (#49) to address this.

In the meantime, just try to find hyperparameters that actually improve accuracy over the 0th model and testing should work. Alternatively, you can set check_untrained_accuracy to False, and that'll also work. But improving over the 0th model is probably what you want to do, and keeping the flag set to True allows you to check that.
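The `ValueError: max() arg is an empty sequence` in the traceback is what Python raises whenever `max` receives no items, which is what happens when there is no best-model file to collect epochs from. Below is a minimal sketch of that failure mode and one way to guard against it. It assumes, hypothetically, that epochs are parsed from file names like `trunk_best12.pth`; it is not the library's actual `latest_version` code, and both helper names are made up.

```python
import re


def best_epochs(file_names):
    """Extract epoch numbers from names like 'trunk_best12.pth'.

    The 'best<epoch>' naming pattern is an assumption for illustration.
    """
    epochs = []
    for name in file_names:
        match = re.search(r"best(-?\d+)", name)
        if match:
            epochs.append(int(match.group(1)))
    return epochs


def latest_best_epoch(file_names):
    """Return the highest best-model epoch, or None when there is none.

    max(..., default=None) avoids the ValueError that a bare max() raises
    on an empty sequence.
    """
    return max(best_epochs(file_names), default=None)
```

With this guard, a folder where accuracy never improved yields `None` instead of an exception, so the caller can decide how to handle the missing best model.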

gustmd0121 commented on September 14, 2024

Yes I will try to improve accuracy using different hyperparameters. Thanks again for everything!
