cosmo.pytorch's People

Contributors: numpee, postbg

cosmo.pytorch's Issues

Why are images in PNG format?

def _create_img_path_from_id(root, id):
    return os.path.join(root, '{}.png'.format(id))

Hi, @postBG . Thanks for your great work.

I believe the original images in the FashionIQ dataset are in JPG format, while your code reads images as PNG, which leads to an error.

Do you apply any pre-processing?
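One possible workaround, sketched under assumptions: instead of hard-coding the `.png` extension, look for the image under several extensions and return the first file that exists. The helper name `resolve_img_path` and the extension list are hypothetical and not part of the repository; whether the authors converted the images to PNG beforehand is not stated in this thread.

```python
import os


def resolve_img_path(root, img_id, extensions=(".png", ".jpg", ".jpeg")):
    """Return the first existing file root/<img_id><ext>, or None if none exists.

    Hypothetical helper: the name and the extension list are assumptions
    made for illustration, not code from cosmo.pytorch.
    """
    for ext in extensions:
        candidate = os.path.join(root, "{}{}".format(img_id, ext))
        if os.path.isfile(candidate):
            return candidate
    return None
```

Returning `None` rather than raising lets the caller decide whether a missing image should be skipped or reported.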

About FashionIQ

Hi,
I am very interested in this task.
However, I tried to download the dress data of the FashionIQ dataset and found that about 905 image URLs are missing. I ended up with 18182 dress images. I would like to know whether you have the complete dataset and how many images it contains.
Also, could you send the FashionIQ dataset to my Gmail: [email protected]?
Thanks.

LSTM hidden size

In text_encoders/lstm.py, the hidden size of the LSTM is defined as:
lstm_hidden_size = kwargs.get('lstm_hidden_size', 512)

But in text_encoders/__init__.py, the desired LSTM hidden size (from the config) is not used.

Could you check this? Thanks.
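To illustrate the concern, here is a minimal sketch of how a `kwargs.get(..., 512)` default silently wins when the factory does not forward the config value. Only the `kwargs.get('lstm_hidden_size', 512)` line comes from the issue; the function names and the config dictionary are hypothetical stand-ins, not the repository's actual code.

```python
def build_lstm_settings(**kwargs):
    # Same pattern as quoted in the issue: fall back to 512 when the key is absent.
    lstm_hidden_size = kwargs.get('lstm_hidden_size', 512)
    return {"lstm_hidden_size": lstm_hidden_size}


def factory_without_forwarding(config):
    # Hypothetical factory that ignores the config: the default is always used.
    return build_lstm_settings()


def factory_with_forwarding(config):
    # Hypothetical fix: pass the configured value through as a keyword argument.
    return build_lstm_settings(lstm_hidden_size=config["lstm_hidden_size"])


config = {"lstm_hidden_size": 1024}  # assumed config entry for illustration
dropped = factory_without_forwarding(config)
forwarded = factory_with_forwarding(config)
```

In this sketch `dropped` keeps the default of 512 while `forwarded` picks up the configured 1024, which is the behavior the issue asks the authors to check.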

Problems about Fashion200k dataset

Hi, there.

Thank you for the great work!

According to the Fashion200k repo, there are two versions of the Fashion200k pictures, namely cropped detected images and original images. Which one do you use in your implementation?

Problem with the FashionIQ dataset in the drive

Hello. As mentioned in the paper, there should be around 77000 images in this dataset, but the dataset in this drive contains around 74000 images. This causes problems when comparing results in academic research. Do you have access to the full dataset?

Fashion200k result

Hello, I have some questions about the Fashion200k dataset.

My reproduced result is about 7~8% lower than the one reported for Fashion200k, so I have several questions I hope you can answer.

  1. Does the gallery(database) contain all the images from data/labels/xxx_test_detect_all.txt?
  2. Are query images from data/test_queries.txt?
  3. In your paper and GitHub page, you said that the modifier should be "Change A to B". However, we find that it is actually "Replace A with B" in the code link you provided. Does this have a negative impact?
  4. The target image is not unique when testing on Fashion200k (which is different from Shoes and FashionIQ). Is it correct to write the Fashion200k dataset code following the FashionIQ one? Are any additional modifications needed?
  5. Could you please release the code for these three datasets?

These questions have also been sent to you by email. I'm afraid you may be too busy to notice my email, so I am raising this issue. Please reply through whichever channel is more convenient for you. Thanks a lot.
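Regarding question 4, one way to handle non-unique targets is to count a query as a hit when any of its valid targets appears in the top k. The following is only a sketch of that idea under assumptions; the function name and the data layout (ranked gallery ids per query, a set of valid target ids per query) are hypothetical, and this is not necessarily the protocol used in the paper.

```python
def recall_at_k(ranked_ids, valid_target_ids, k):
    """Fraction of queries with at least one valid target in the top-k results.

    ranked_ids: list of lists, gallery ids sorted by decreasing similarity.
    valid_target_ids: list of sets, all acceptable targets for each query.
    Both layouts are assumptions made for this illustration.
    """
    if not ranked_ids:
        return 0.0
    hits = 0
    for ranking, targets in zip(ranked_ids, valid_target_ids):
        # A query counts once, no matter how many of its targets are retrieved.
        if any(gallery_id in targets for gallery_id in ranking[:k]):
            hits += 1
    return hits / len(ranked_ids)


rankings = [["a", "b", "c"], ["c", "a", "b"]]
targets = [{"b", "z"}, {"b"}]
```

With a single-element target set per query this reduces to the usual Recall@K for datasets with unique targets.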

Concatenate two captions?

Hi, @numpee

I also found a slightly different experimental setting between CoSMo and other methods.

As shown in the official FashionIQ evaluation codebase, the two captions of each triplet are concatenated into one sentence.

Following this setting, many other methods concatenate the two captions, while VAL doesn't.

I guess you don't concatenate the two captions in order to make a fair comparison with VAL.

However, I guess you may achieve higher performance by following this setting.
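For reference, concatenating the two captions of a triplet could look like the sketch below. The function name is hypothetical, and the " and " joiner is an assumption inferred from the sample output quoted in another issue on this page ("is solid black with no sleeves and is black with straps"); the official codebase may join captions differently.

```python
def merge_captions(captions, joiner=" and "):
    """Join the captions of one triplet into a single sentence.

    The joiner is an assumption for illustration; empty or whitespace-only
    captions are dropped so they do not leave a dangling joiner.
    """
    cleaned = [c.strip() for c in captions if c and c.strip()]
    return joiner.join(cleaned)


merged = merge_captions(["is solid black with no sleeves", "is black with straps"])
```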

Actual differences between FashionIQ evaluation method and VAL evaluation method

Hi!
The README points out that the evaluation method reported in the paper is slightly different from the evaluation method of the original FashionIQ dataset (in order to match the method used by VAL).

However, I can't quite figure out what the actual differences between the two methods are.
Can someone explain them to me in detail?
Thanks, Alberto

Could you share the Shoes dataset?

Hi,

Thank you for the great work! I found that the link to the original Shoes dataset is broken. Could you share the dataset? Thank you very much.

Different results from provided notebook.

Hi, @postBG . Thanks for your great work.

I am preparing the vocab according to the instructions in the README. However, my outputs differ from those shown in jupyter_files/how_to_create_fashion_iq_vocab.ipynb.

My third code block's result is:

is solid black with no sleeves and is black with straps
B005X4PL1G

And the result of the sixth is:
2957

I guess you intended to load the test split to build the vocab, while the val split is loaded instead. But I am not sure. Please give me some hints.

Doubts about the results of TIRG in the paper

Hi @numpee ,

For the results of TIRG on FashionIQ mentioned in Table 1 of the main paper and in the supplementary material, did you run several experiments yourself and obtain similar results?

I understand you just copied the results reported in VAL. However, I suspect these results are wrong.

According to my experiment, I get the following performance on the original split with TIRG (with ResNet-50 and Bi-GRU; no GloVe or BERT embeddings were used here):

Shirt R@10  Shirt R@50  Dress R@10  Dress R@50  Toptee R@10  Toptee R@50
18.50       43.03       21.81       46.26       24.02        51.10

This performance is much better than both VAL and my reproduced CoSMo, and is also very close to the reported CoSMo.

Also, this paper reaches the same conclusion as I do (although our settings are different, our comparison between VAL and TIRG is fair; please see Table 1 for details): TIRG is much better than VAL, and the results reported in VAL are wrong.

If our observations are correct, then how can the performance gain of CoSMo be demonstrated (although I totally agree with CoSMo's insight)?

Please point out if I am wrong. Thanks in advance.

About the result

[image: results screenshot]
I tested the model on the dress dataset and got the result above. The results do not match the paper, even though I ran it following the command you provided. Do you know why?

Hyperparameter settings

Hi, I ran the example code (FashionIQ dataset) with the default configuration provided in the project, but couldn't achieve the results reported in the paper. I wonder if there is a problem with my hyperparameter settings (I adjusted the random seed, learning rate, etc.); the highest I can achieve is a top@50 accuracy of 44% on the toptee subset (the value reported in the paper is about 57%).
If I want to reproduce the experimental results in the paper, could you please give me some suggestions for my experiments?

Separately trained on FashionIQ subsets?

Hi, @numpee . Another question please.

Did you train three models on FashionIQ dress/toptee/shirt separately?

In other words, are the results shown in Table 1 from one model or three models?
