Comments (4)

yanbeic commented on September 13, 2024

Hi,
In this paper, one caption is used as text input in training and evaluation. This follows the same spirit as the previous CVPR19 paper and the FashionIQ paper. Since the training and the evaluation are consistent, this setup should be reasonable.

Note that in the original FashionIQ paper, there are no results reported for the exact same task considered in this paper. Our reported results are obtained by re-implementation of prior methods under the same setup, so the evaluation in our paper is comparable.

from val.

helson73 commented on September 13, 2024

Sorry, I confused the task in their paper ("The fashionIQ dataset : ...") with their FashionIQ challenge.

If all the evaluation results in your result chart were obtained by yourself and evaluated under the same protocol, then they are indeed comparable. My bad for confusing the two.

However, the people who released the FashionIQ dataset also organized a contest named "FashionIQ challenge", and I think you mentioned results from this task in your paper as "un-published SOTA". (Although those results were not published in major journals or conferences, the reports are available.)

The "FashionIQ challenge" is exactly the same task as the one in your work, except for one thing: the evaluation protocol.

The "FashionIQ challenge" treats the two captions together as one text input, and all participants in the 2019 and 2020 FashionIQ challenges follow the same rule. Alongside the official FashionIQ dataset, they released a "starter-kit" for evaluating the baseline (TIRG) on the FashionIQ dataset, and this "starter-kit" follows the same rule: it treats the two captions together as one input.

From the perspective of a newcomer approaching this task for the first time, one would usually visit the official website first and then very likely use the "starter-kit" as a starting point. Having read your paper, one might well assume that the task in your paper is the same as the task in the FashionIQ challenge.

I understand your concern about consistency with other works (the CVPR19 paper, for instance), but in this case an official challenge already exists for this specific dataset. You did the same task and mentioned the challenge in your paper (not in the chart, but mentioned anyway), yet used a different evaluation protocol without stating so directly, which may cause confusion.
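To make the difference concrete, here is a rough sketch of how the text query could be built under each protocol. It is only an illustration, not the code from the "starter-kit" or from the paper: the record layout (a `captions` list holding two strings), the function names, and the `" and "` separator are all assumptions of mine.

```python
from typing import Dict, List


def joint_caption_query(record: Dict, sep: str = " and ") -> str:
    """Challenge-style query: both captions merged into one text input.

    The separator is a hypothetical choice; the actual kit may join differently.
    """
    return sep.join(caption.strip() for caption in record["captions"])


def single_caption_queries(record: Dict) -> List[str]:
    """Single-caption style: each caption is a separate text input.

    This is one possible reading of "one caption is used as text input".
    """
    return [caption.strip() for caption in record["captions"]]


# Hypothetical annotation record, made up for illustration only.
record = {
    "candidate": "img_a",
    "target": "img_b",
    "captions": ["is darker", "has longer sleeves"],
}

print(joint_caption_query(record))
print(single_caption_queries(record))
```

With the joint query the model sees both modifications at once, so recall numbers from the two protocols are not directly comparable.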

If possible, I suggest you put a notice about this difference, at least in the readme file :)

helson73 commented on September 13, 2024

I have another question about the evaluation result table in your paper.

The paper shows a large gap on the FashionIQ dataset between the evaluation results of the original TIRG implementation and those of the re-implementation. I am wondering what causes this. More specifically, what is the difference between the original TIRG and your re-implemented one?

yanbeic commented on September 13, 2024

The difference is the backbone network. ResNet-50 is used on FashionIQ in this paper.
