In the Jupyter viewer, INV and DIR tests show the predictions for the two sentences in reverse order. (checklist issue, closed)

marcotcr commented on August 18, 2024

In the Jupyter viewer, INV and DIR tests show the predictions for the two sentences in reverse order.

from checklist.

Comments (12)

marcotcr commented on August 18, 2024

Sorry for the long delay in responding. The format you are using is pred_and_softmax, where the first column is the prediction and the next columns have the prediction probabilities. You'll note that your predictions do not match your probabilities: the first ten lines in your file have prediction=2 (positive), even though the probability of negative is 99%.
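
A minimal sketch of how such a mismatch could be detected before running the tests, assuming a whitespace-separated file in which the first column is the predicted label and the remaining columns are the per-label probabilities. The helper name `find_mismatches` is hypothetical and not part of checklist; the two sample rows are taken from the file quoted in this thread.

```python
from typing import List, Tuple


def find_mismatches(lines: List[str]) -> List[Tuple[int, int, int]]:
    """Return (line_number, stated_pred, argmax_pred) for each inconsistent row.

    Hypothetical helper: assumes "pred prob_0 prob_1 ..." separated by whitespace.
    """
    mismatches = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        stated = int(fields[0])
        probs = [float(value) for value in fields[1:]]
        # Index of the largest probability, i.e. the label the softmax implies.
        argmax = max(range(len(probs)), key=lambda i: probs[i])
        if stated != argmax:
            mismatches.append((number, stated, argmax))
    return mismatches


rows = [
    "2 0.99984 0.000000 0.00016",
    "2 0.99190 0.000000 0.00810",
]
print(find_mismatches(rows))  # [(1, 2, 0), (2, 2, 0)]
```

Both rows state label 2 while the probabilities point to label 0, which is the inconsistency described above.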

tongshuangwu commented on August 18, 2024

Hi! Could you please give me a more complete example test, so I can take a closer look? For example, what do you see if you print test.data and test.conf?

ahzz1207 commented on August 18, 2024

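Hi, I guess from your username that you are Chinese, so I am posting a Chinese example for you to look at.

Example: I remove the punctuation from the sentence for an INV test. Because it is an NER model, test.conf is None. (The sentence below reads: "Reports say India's foreign exchange reserves currently stand at 25.52 billion US dollars"; removing the punctuation turns 255.2 into 2552.)
In the Jupyter HTML view, it is shown like this: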
报道说 , 印度目前外汇储备为 255.2亿→ 2552亿美元。 Pred: ['印度' '2552亿美元']→['印度' '255.2亿美元']
print(test.data):
[['报道说,印度目前外汇储备为255.2亿美元。', '报道说印度目前外汇储备为2552亿美元']]

Another issue: if the sentence and prediction of a test case are too long, Jupyter does not show the complete prediction in the test case's HTML box. I think the prediction box, unlike the sentence box, cannot wrap lines automatically, so I have to change the Jupyter width to 200% to see the whole content.

tongshuangwu commented on August 18, 2024

Thanks for catching both bugs!
I've fixed the display in 8a0f05c, and now the display should also automatically wrap lines for the prediction (in 7997666).
Both should work if you reinstall from the repo; we will bump the pip release later.
Please feel free to re-open the issue if you still run into problems!

(And yes, I'm Chinese :P)

ahzz1207 commented on August 18, 2024

Thank you for doing that. Can I just copy some folders, like checklist/viewer and checklist/visual_interface, and then reinstall the project? I ask because I have modified a lot of files in the project.
By the way, I have seen your photo on a WeChat official account (公众号), haha.

tongshuangwu commented on August 18, 2024

Yeah, that works: just copy and replace /checklist/viewer/static/, and then install locally again with pip install -e . :)
On a side note, in case checklist has future updates, you probably want to consider forking the repo, so that later you can just fetch our updates!
(Well, at least that photo is with a cute doggy lol)

ahzz1207 commented on August 18, 2024

Hi, I copied and replaced /checklist/viewer/static/ as you said, and then ran pip install -e . in the project. But I got the output below, and I don't know whether it reinstalled successfully.

Installing collected packages: checklist
Attempting uninstall: checklist
Found existing installation: checklist 0.0.4
Can't uninstall 'checklist'. No files were found to uninstall.
Running setup.py develop for checklist
Successfully installed checklist

tongshuangwu commented on August 18, 2024

I couldn't really guess what's going on, but hopefully this stackoverflow thread can help?

Another hack to try, essentially forcing pip to upgrade the package:

bump the version in setup.py (here)
pip install --upgrade -e .
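
One way to check whether the reinstall took effect is to ask Python which version and which file it actually resolves for the package. This is only a sketch using the standard library; `describe_install` is a hypothetical helper, and it assumes the distribution name and the import name are both `checklist`, as the pip output above suggests.

```python
import importlib.util
from importlib import metadata
from typing import Optional, Tuple


def describe_install(package: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (installed_version, module_file) for a package, or None for missing parts."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # no installed distribution metadata under this name
    spec = importlib.util.find_spec(package)
    origin = spec.origin if spec is not None else None
    return version, origin


# For an editable install, the second value should point into the local
# clone of the repo rather than into site-packages.
print(describe_install("checklist"))
```

If the reported version matches the bumped value in setup.py and the path points at the local repo, the local copy is the one being imported.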

ahzz1207 commented on August 18, 2024

Thanks for the reply. I did nothing further, but when I tested in Jupyter I found that the bug is gone!

MartaMarchiori commented on August 18, 2024

Hi :)
I have a similar problem, but related to the MFT test type. In the visual summary all is well, but in the textual summary the "Example fails" section reports that phrases with negative sentiment have confidence 1.0 for the negative label, while in the visual table the example is reported as misclassified as 2, i.e. positive (Expect: 0, Pred: 2).

The strange thing is that I only get this error when I test a model other than the released ones (amazon, google, etc.). Are there particular format rules to follow besides saving the tests_n500 predictions in a txt file with the format "pred - prob for 0 - prob for 1 - prob for 2"?

Sentiment-laden words in context
Test cases: 8658
Test cases run: 500
Fails (rate): 208 (41.6%)

Example fails:
1.0 0.0 0.0 This was a creepy aircraft.
1.0 0.0 0.0 That cabin crew is creepy.
1.0 0.0 0.0 This food is lame.

marcotcr commented on August 18, 2024

Hm, this is odd. Can you provide us with a small example?
The default prediction file format is indeed the prediction followed by the softmax probabilities (no matter how many labels there are).
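
As an illustration of that layout, here is a minimal sketch that builds one line of such a file from a list of probabilities, deriving the prediction as the index of the largest probability so the two can never disagree. The function name `format_prediction_line`, the space separator, and the six-decimal formatting are assumptions based on the sample files discussed in this thread, not a documented checklist API.

```python
from typing import Sequence


def format_prediction_line(probs: Sequence[float]) -> str:
    """Format one row as "pred prob_0 prob_1 ..." (hypothetical helper)."""
    # The prediction is the label with the highest probability.
    pred = max(range(len(probs)), key=lambda i: probs[i])
    return " ".join([str(pred)] + [f"{p:.6f}" for p in probs])


print(format_prediction_line([0.99984, 0.0, 0.00016]))
# 0 0.999840 0.000000 0.000160
```

The same function works for any number of labels, since it only relies on the length of the probability list.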

MartaMarchiori commented on August 18, 2024

Yes, of course and thank you
These are the first 10 lines of the txt file located in the predictions folder. The only thing that differs from the roberta/amazon/* files is that the probabilities for Negative and Positive have 5 decimals instead of 6, but that is not a real difference (right?).

2 0.99984 0.000000 0.00016
2 0.99190 0.000000 0.00810
2 0.99999 0.000000 0.00001
2 0.98929 0.000000 0.01071
2 0.99970 0.000000 0.00030
2 0.99999 0.000000 0.00001
2 0.99999 0.000000 0.00001
2 0.99848 0.000000 0.00152
2 0.99999 0.000000 0.00001
2 0.99998 0.000000 0.00002

I also include the first lines of the textual summary. The problem appears only when negative labels are involved.

Vocabulary

single positive words
Test cases: 34
Fails (rate): 0 (0.0%)

single negative words
Test cases: 35
Fails (rate): 32 (91.4%)

Example fails:
1.0 0.0 0.0 regretted
1.0 0.0 0.0 dislike
0.4 0.0 0.6 horrible

single neutral words
Test cases: 13
Fails (rate): 13 (100.0%)

Example fails:
1.0 0.0 0.0 international
1.0 0.0 0.0 Israeli
1.0 0.0 0.0 see

Sentiment-laden words in context
Test cases: 8658
Test cases run: 500
Fails (rate): 208 (41.6%)

Example fails:
1.0 0.0 0.0 This airline is dreadful.
1.0 0.0 0.0 That service is terrible.
1.0 0.0 0.0 The cabin crew is terrible.

neutral words in context
Test cases: 1716
Test cases run: 500
Fails (rate): 500 (100.0%)

Example fails:
1.0 0.0 0.0 This is an American crew.
1.0 0.0 0.0 That customer service was Indian.
1.0 0.0 0.0 This was an Indian aircraft.
