Comments (12)

marcotcr commented on August 18, 2024

Sorry for the long delay in responding. The format you are using is pred_and_softmax, where the first column is the prediction and the next columns have the prediction probabilities. You'll note that your predictions do not match your probabilities: the first ten lines in your file have prediction=2 (positive), even though the probability of negative is 99%.
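
A quick way to spot this kind of mismatch is to compare the first column against the argmax of the remaining columns. The sketch below is plain Python and not part of checklist: `find_mismatches` is a hypothetical helper, and it assumes whitespace-separated columns with the label order 0 = negative, 1 = neutral, 2 = positive used in this thread. The first two sample rows are taken from the file posted in this thread; the third is made up for contrast.

```python
def find_mismatches(lines):
    """Return (line_number, pred, argmax) for rows where pred != argmax of the probabilities."""
    mismatches = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pred = int(fields[0])
        probs = [float(x) for x in fields[1:]]
        # Index of the largest probability (first one wins on ties).
        best = max(range(len(probs)), key=probs.__getitem__)
        if pred != best:
            mismatches.append((number, pred, best))
    return mismatches


sample = [
    "2 0.99984 0.000000 0.00016",  # from the thread: pred=2, but negative has ~99.98%
    "2 0.99190 0.000000 0.00810",  # from the thread
    "0 0.99999 0.000000 0.00001",  # hypothetical consistent row, added for contrast
]
print(find_mismatches(sample))  # [(1, 2, 0), (2, 2, 0)]
```

Any row this reports is one where the textual summary (which shows the probabilities) and the visual table (which shows the prediction column) would disagree.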

from checklist.

tongshuangwu commented on August 18, 2024

Hi! Could you please give me a more complete example test, so I can take a closer look? For example, what do you see if you print test.data and test.conf?

ahzz1207 commented on August 18, 2024

Hi, I'm guessing from your username that you are Chinese, so I'm posting one example for you to take a look at.

Example: I removed the punctuation from the sentence for an INV test. Because it's an NER model, test.conf is None.
In the Jupyter HTML view, it is shown like this:
报道 说 , 印度 目前 外汇 储备 为 255.2亿→ 2552亿 美元 。 Pred: ['印度' '2552亿美元']→['印度' '255.2亿美元']
print(test.data):
[['报道说,印度目前外汇储备为255.2亿美元。', '报道说印度目前外汇储备为2552亿美元']]
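
The pair printed above can be reproduced with a small plain-Python sketch. This is not checklist's own perturbation function: `strip_punctuation` is a hypothetical helper that assumes "punctuation" means every character in a Unicode `P*` category.

```python
import unicodedata


def strip_punctuation(text):
    """Drop every character whose Unicode category starts with 'P' (punctuation)."""
    return "".join(ch for ch in text if not unicodedata.category(ch).startswith("P"))


original = "报道说,印度目前外汇储备为255.2亿美元。"
perturbed = strip_punctuation(original)

# Same shape as the test.data entry shown above: [original, perturbed]
pair = [original, perturbed]
print(pair)
```

Note that under this assumption the decimal point in 255.2 is removed along with the comma and the full stop, so the amount itself changes to 2552. A changed prediction for that entity may therefore reflect the perturbation rather than a model failure.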

Another issue: if the sentence and the prediction of a test case are too long, Jupyter does not show the complete prediction in the test case's HTML box. I think this is because the prediction box, unlike the sentence box, cannot wrap lines automatically. So I need to change the Jupyter width to 200% to see the whole content.

tongshuangwu commented on August 18, 2024

Thanks for catching both bugs!
I've fixed the display in 8a0f05c, and now the display should also automatically wrap lines for the prediction (in 7997666).
Both should work if you reinstall from the repo; we will update the pip release later.
Please feel free to re-open the issue if you still run into problems!

(And yes, I'm Chinese :P)

ahzz1207 commented on August 18, 2024

Thank you for doing that. Can I just copy some folders, like checklist/viewer and checklist/visual_interface, and then reinstall the project? I ask because I have modified a lot of files in the project.
By the way, I have seen your photo on a WeChat official account (公众号), haha.

tongshuangwu commented on August 18, 2024

Yeah that works, just copy-n-replace /checklist/viewer/static/, and then install locally again: pip install -e . :)
On a side note, in case checklist has future updates, you may want to consider forking the repo, so that later you can just fetch our updates!
(Well, at least that photo is with cute doggy lol)

ahzz1207 commented on August 18, 2024

Hi, I copied and replaced /checklist/viewer/static/ as you said, and then ran pip install -e . in the project. But I got the output below, and I don't know whether it reinstalled successfully:

Installing collected packages: checklist
Attempting uninstall: checklist
Found existing installation: checklist 0.0.4
Can't uninstall 'checklist'. No files were found to uninstall.
Running setup.py develop for checklist
Successfully installed checklist

tongshuangwu commented on August 18, 2024

I couldn't really guess what's going on, but hopefully this stackoverflow thread can help?

Another hack to try, essentially forcing pip to upgrade the package:

  1. bump the version in setup.py (here)
  2. pip install --upgrade -e .
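
To check which installation Python actually picks up after these steps, a standard-library sketch like the one below may help. It is not part of checklist; `installed_version` and `module_location` are hypothetical helper names, and the only assumption is that the distribution and the importable module are both called `checklist`.

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed distribution version, or None if no metadata is found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def module_location(module):
    """Return the file a top-level module would be imported from, or None if not importable."""
    spec = importlib.util.find_spec(module)
    return spec.origin if spec else None


# With an editable install, the location should point into the local repo checkout.
print(installed_version("checklist"), module_location("checklist"))
```

If the reported version matches the bumped value in setup.py and the location points at the local checkout, the editable install is most likely the one in use.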

ahzz1207 commented on August 18, 2024

Thanks for the reply. I did nothing further, but when I tested in Jupyter again, I found that the bugs were gone!

MartaMarchiori commented on August 18, 2024

Hi :)
I have a similar problem, but related to the MFT test type. In the visual summary all is well, but in the textual summary the "Example fails" section reports that phrases with negative sentiment have confidence 1.0 for the negative label, while in the visual table the same example is reported as misclassified as 2, i.e. positive (Expect: 0, Pred: 2).

The strange thing is that I only get this error when I test a model other than the released ones (amazon, google, etc.). Are there particular format rules to follow besides saving the tests_n500 predictions in a txt file with the format "pred - prob for 0 - prob for 1 - prob for 2"?

Sentiment-laden words in context
Test cases: 8658
Test cases run: 500
Fails (rate): 208 (41.6%)

Example fails:
1.0 0.0 0.0 This was a creepy aircraft.
1.0 0.0 0.0 That cabin crew is creepy.
1.0 0.0 0.0 This food is lame.

marcotcr commented on August 18, 2024

Hm, this is odd. Can you provide us with a small example?
The default prediction file format is indeed the prediction followed by the softmax (no matter how many labels there are)
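
As a rough illustration of that layout, the sketch below builds each row from the probabilities alone, so the prediction column is always the argmax. `format_prediction_line` is a hypothetical helper, not a checklist function, and the space separator and 6 decimal places are assumptions based on the files discussed in this thread.

```python
def format_prediction_line(probs):
    """Format one row as 'pred p0 p1 ...', where pred is the argmax of probs."""
    pred = max(range(len(probs)), key=probs.__getitem__)
    return " ".join([str(pred)] + [f"{p:.6f}" for p in probs])


rows = [
    [0.99984, 0.0, 0.00016],  # probabilities from the first row posted in this thread
    [0.1, 0.2, 0.7],          # hypothetical row for contrast
]
lines = [format_prediction_line(r) for r in rows]
print("\n".join(lines))
# 0 0.999840 0.000000 0.000160
# 2 0.100000 0.200000 0.700000
```

Deriving the first column from the probabilities in one place avoids the two drifting apart, for example when the label order used for the prediction differs from the column order of the softmax.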

MartaMarchiori commented on August 18, 2024

Yes, of course, and thank you.
These are the first 10 lines of the txt file located in the predictions folder. The only difference from the roberta/amazon/* files is that the probabilities for Negative and Positive have 5 decimal places instead of 6, but I mean... this is not a real difference (right?)

2 0.99984 0.000000 0.00016
2 0.99190 0.000000 0.00810
2 0.99999 0.000000 0.00001
2 0.98929 0.000000 0.01071
2 0.99970 0.000000 0.00030
2 0.99999 0.000000 0.00001
2 0.99999 0.000000 0.00001
2 0.99848 0.000000 0.00152
2 0.99999 0.000000 0.00001
2 0.99998 0.000000 0.00002

I report the first lines of the textual summary too: the problem persists only when negative labels are involved

Vocabulary

single positive words
Test cases: 34
Fails (rate): 0 (0.0%)

single negative words
Test cases: 35
Fails (rate): 32 (91.4%)

Example fails:
1.0 0.0 0.0 regretted
1.0 0.0 0.0 dislike
0.4 0.0 0.6 horrible

single neutral words
Test cases: 13
Fails (rate): 13 (100.0%)

Example fails:
1.0 0.0 0.0 international
1.0 0.0 0.0 Israeli
1.0 0.0 0.0 see

Sentiment-laden words in context
Test cases: 8658
Test cases run: 500
Fails (rate): 208 (41.6%)

Example fails:
1.0 0.0 0.0 This airline is dreadful.
1.0 0.0 0.0 That service is terrible.
1.0 0.0 0.0 The cabin crew is terrible.

neutral words in context
Test cases: 1716
Test cases run: 500
Fails (rate): 500 (100.0%)

Example fails:
1.0 0.0 0.0 This is an American crew.
1.0 0.0 0.0 That customer service was Indian.
1.0 0.0 0.0 This was an Indian aircraft.
