mlfoundations / task_vectors

Editing Models with Task Arithmetic

Python 100.00%

task_vectors's Issues

Corrupted Checkpoint ViT-L-14/Cars/finetuned.pt

Hi @gabrielilharco, when I download the ViT-L-14/Cars/finetuned.pt checkpoint from Google Drive and try to load it, I get an error saying that the checkpoint is corrupted. All the other checkpoints work fine. I have tried downloading it multiple times and it still does not work. Would it be possible for you to re-upload the ViT-L-14/Cars/finetuned.pt checkpoint?

Classification Heads for Val Split.

Hi @gabrielilharco, I was trying to run some experiments on the validation dataset. I learned from the other issues that I need to use [dataset]Val to select the validation split. When I tried this, the code started to create a new classification head for the validation split of the dataset. Is this the expected behavior? I thought that a single classifier would work for all splits. Am I missing something here?

Clarification about classification heads

Hi @gabrielilharco,
thanks for the great work!

I have a question about the classification heads used in your experiments and available here. How exactly did you train them? Looking at the code, I can see that you manually construct a zero-shot classifier based on embeddings of class names inserted into various templates. Importantly, you use the pretrained OpenCLIP model to calculate the embeddings for the classifier, not the model finetuned for a particular task. Am I right about that?

What is the reason for that? To me, the most natural way to obtain the classifier would be to take model.classification_head after finetuning a task-specific model (here). This classification head is aligned with the finetuned model, while the zero-shot head is aligned with the pretrained model, so the head from finetuning seems more suitable. Did you consider such an approach?
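For readers following along, the construction described in this issue can be sketched roughly as below. This is an illustration only, not the repository's code: build_zeroshot_head and fake_encode_text are hypothetical names, the stand-in encoder just returns deterministic random vectors in place of a frozen pretrained text encoder, and any logit scaling the real code may apply is omitted.

```python
import torch


def build_zeroshot_head(encode_text, classnames, templates):
    """Return a (num_classes, dim) weight matrix built from text embeddings."""
    rows = []
    for name in classnames:
        prompts = [t.format(name) for t in templates]
        emb = encode_text(prompts)                   # (num_templates, dim)
        emb = emb / emb.norm(dim=-1, keepdim=True)   # unit-normalize each prompt
        mean = emb.mean(dim=0)                       # average over templates
        rows.append(mean / mean.norm())              # re-normalize the class vector
    return torch.stack(rows)


def fake_encode_text(prompts, dim=16):
    """Hypothetical stand-in for a pretrained text encoder (random vectors)."""
    rows = []
    for p in prompts:
        gen = torch.Generator().manual_seed(sum(p.encode("utf-8")))
        rows.append(torch.randn(dim, generator=gen))
    return torch.stack(rows)


head = build_zeroshot_head(
    fake_encode_text,
    classnames=["cat", "dog"],
    templates=["a photo of a {}.", "a blurry photo of a {}."],
)

# Classification is then a dot product between image features and class vectors.
image_features = torch.randn(4, 16)
logits = image_features @ head.T   # shape (4, 2)
```

Because the head depends only on the text encoder, it can be built once from the pretrained model and reused with any image encoder that shares the same embedding space, which is the design choice the question is asking about.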

Corrupted Checkpoint ViT-B-16/Cars/finetuned.pt

When I try to load this checkpoint with torch.load, it raises

RuntimeError: Invalid magic number; corrupt file?

The md5sum of the downloaded file is 2c3b6a4f39e1b62def9ccf7288099e64.
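A quick way to narrow this kind of problem down is to check the checksum and the first bytes of the downloaded file. The sketch below is only an illustration: the helper names are made up here, and the header heuristics assume that a valid checkpoint is either a zip archive (the default torch.save format since PyTorch 1.6) or a legacy binary pickle stream. An HTML header would suggest the download returned a web page rather than the checkpoint, which is one possible, unconfirmed explanation for the error.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it all into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def guess_file_kind(path):
    """Classify a file by its leading bytes (heuristic, not exhaustive)."""
    with open(path, "rb") as f:
        head = f.read(16)
    if head.startswith(b"PK\x03\x04"):
        return "zip"      # zip-based checkpoint format
    if head.startswith(b"\x80"):
        return "pickle"   # binary pickle stream (legacy checkpoint format)
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "html"     # a web page was saved instead of the file
    return "unknown"
```

Comparing md5_of_file("finetuned.pt") against a checksum computed from a known-good copy, if one is available, would confirm whether the file on disk matches the uploaded one.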

How can I fine-tune a T5 model with the task_vectors method for text classification tasks?

Could you please update the corresponding code?

Give a pre-trained model that can be loaded directly using the ‘model_weights = torch.load(file_path, map_location='cpu')’

While trying to load the file using Python's pickle module, I encountered a _pickle.UnpicklingError stating that persistent IDs in protocol 0 must be ASCII strings. Here is the exact error message:

_pickle.UnpicklingError: persistent IDs in protocol 0 must be ASCII strings

I attempted to resolve the issue by employing various methods, including utilizing the persistent_load parameter with pickle.Unpickler and trying to load the file in different environments, but unfortunately, all efforts have been in vain.

Request for Assistance:
Given the circumstances, I was hoping you could provide some insights or guidance on the following points:

Creation Environment: Could you share details about the environment in which the file was created, including the Python and PyTorch versions used?
Persistent IDs: Any information or context regarding the persistent IDs encountered in the file would be immensely helpful.
Loading Method: If there is a specific method or procedure to correctly load the file, could you please share it with me?
Additional Details: Any other details or specifications about the file that you think might assist in resolving the issue would be greatly appreciated.

Requesting the Multitask Checkpoint For Learning via Addition

Hi @gabrielilharco, I see that in Appendix D.2 you mention training a multitask checkpoint on the eight vision tasks (SUN397, Cars, RESISC45, EuroSAT, SVHN, GTSRB, MNIST, DTD) for the learning via addition experiments, and that you report the multitask normalized performance to be 99.4. Did you share the raw numbers for each task somewhere? If you could share these checkpoints for ViT-B-32 and ViT-L-14, that would be really helpful.

Thanks in advance,
Prateek

Value for args.data_location

It's unclear to me what I should assign as the value of args.data_location. The README says args.data_location = '/path/to/data', but I'm not sure which folder that refers to.

Split definition of DTD, EuroSAT and SUN397

Hi, awesome work!

I'm trying to reproduce your results but I cannot find the split definitions you use for DTD, EuroSAT and SUN397. Would you mind pointing me to the right resources to download the versions of these datasets compatible with your code?

Thanks a lot!

Can't get attribute 'VisualTransformer' on <module 'open_clip.model' from '~/open_clip/model.py'>

Hi @gabrielilharco,
Thank you for your exciting work! I tried to replicate the results using the code from README.md. It showed an error when running:

# Create the task vector
task_vector = TaskVector(pretrained_checkpoint, finetuned_checkpoint)

The error is:
AttributeError: Can't get attribute 'VisualTransformer' on <module 'open_clip.model' from '/srv/home/<user_name>/anaconda3/envs/task-vectors/lib/python3.10/site-packages/open_clip/model.py'>

The error occurs when loading the pretrained weights with pretrained_state_dict = torch.load(pretrained_checkpoint).state_dict().

Could you help me with this? Thanks!
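Some background that may help here: pickle stores a class by its module path and name, so a checkpoint saved as a whole model object can only be loaded if that exact name still exists at load time. One plausible cause, not confirmed in this thread, is that the installed open_clip version no longer defines VisualTransformer in open_clip.model. The toy sketch below reproduces the mechanism with the standard library only; fake_lib and both class names are hypothetical stand-ins, and the remapping unpickler is shown as an illustration of the idea rather than a tested fix for these checkpoints.

```python
import io
import pickle
import sys
import types

# Toy module standing in for a library; all names here are hypothetical.
fake_lib = types.ModuleType("fake_lib")
sys.modules["fake_lib"] = fake_lib

# A class that pickle will record as fake_lib.VisualTransformer.
OldClass = type("VisualTransformer", (), {"__module__": "fake_lib"})
fake_lib.VisualTransformer = OldClass

obj = OldClass()
obj.width = 768
payload = pickle.dumps(obj)

# Simulate a library upgrade that renames the class.
del fake_lib.VisualTransformer
fake_lib.VisionTransformer = OldClass

try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = str(exc)  # the message names the missing attribute


class RenamingUnpickler(pickle.Unpickler):
    """Redirect lookups of renamed classes to their new location."""

    RENAMES = {
        ("fake_lib", "VisualTransformer"): ("fake_lib", "VisionTransformer"),
    }

    def find_class(self, module, name):
        module, name = self.RENAMES.get((module, name), (module, name))
        return super().find_class(module, name)


restored = RenamingUnpickler(io.BytesIO(payload)).load()
```

In practice, installing the library version that the checkpoints were created with is usually the simpler route, since a rename may come with other structural changes that a name remap does not cover.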

ImageNet and its split

Dear authors,

I am trying to reproduce your work based on this repo, and I have run into a problem. It seems that ImageNet is not downloaded automatically by the code. Which ImageNet version did you use? ILSVRC 2012? And do any other changes need to be applied to the datasets?

Best regards,
Hongduan

Task vectors for GPT2 model: attn.bias weights ignored

I am trying to use task vectors for GPT2 models.

In the TaskVector class, when the task vector is created, there is a condition that skips keys in the state dict whose dtype is uint8.

Due to this condition, when I call the apply_to() method of a task vector instance and pass a GPT-2 model checkpoint, I get the following warnings.

Warning: key transformer.h.0.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.1.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.2.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.3.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.4.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.5.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.6.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.7.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.8.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.9.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.10.attn.bias is present in the pretrained state dict but not in the task vector
Warning: key transformer.h.11.attn.bias is present in the pretrained state dict but not in the task vector

Is there a reason why state dict keys with dtype torch.uint8 are ignored?

When that condition is removed, the code runs without any warnings.

Please suggest what should be the best thing to do here.
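As a point of reference, the behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names are hypothetical, and the filter uses torch.is_floating_point rather than the specific dtype check mentioned in the issue. The motivation for skipping such entries is that subtracting unsigned integer tensors wraps around instead of going negative, and non-float entries are typically fixed buffers rather than learned weights. As far as I can tell, attn.bias in the Hugging Face GPT-2 implementation is such a buffer (the causal attention mask), so leaving it out of the task vector and carrying the pretrained value over unchanged should be harmless.

```python
import torch


def make_task_vector(pretrained_sd, finetuned_sd):
    """Return finetuned - pretrained for floating-point entries only."""
    vector = {}
    for key, value in pretrained_sd.items():
        if not torch.is_floating_point(value):
            continue  # e.g. integer counters or uint8/bool mask buffers
        vector[key] = finetuned_sd[key] - value
    return vector


def apply_task_vector(pretrained_sd, vector, scaling_coef=1.0):
    """Add the scaled task vector; copy skipped entries through unchanged."""
    new_sd = {}
    for key, value in pretrained_sd.items():
        if key in vector:
            new_sd[key] = value + scaling_coef * vector[key]
        else:
            new_sd[key] = value.clone()
    return new_sd


# Tiny hand-made state dicts standing in for real checkpoints.
pre = {"w": torch.tensor([1.0, 2.0]),
       "attn.bias": torch.ones(2, 2, dtype=torch.uint8)}
fin = {"w": torch.tensor([1.5, 1.0]),
       "attn.bias": torch.ones(2, 2, dtype=torch.uint8)}

tv = make_task_vector(pre, fin)
merged = apply_task_vector(pre, tv, scaling_coef=0.5)
```

Here merged["w"] moves halfway from the pretrained to the finetuned weights, while merged["attn.bias"] keeps its original dtype and values.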
