There's been a steady trickle of reports that LSI/LDA misbehave, produce degenerate mo


Add more data sanity checks about gensim HOT 7 OPEN

piskvorky commented on May 14, 2024

Add more data sanity checks

from gensim.

Comments (7)

prakhar2b commented on May 14, 2024

@tmylk I would like to work on this. @piskvorky the link above is broken; could you brief me on what it was about?


piskvorky commented on May 14, 2024

@prakhar2b I think it was a scipy crash (segfault) when using sparse arrays and indexing an element out-of-bounds.

I wouldn't say this issue is "easy" -- it will need some careful thinking and planning. We definitely don't want to slow down processing too much by, for example, requiring an extra data pass just to check for bad values.
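One way to avoid a separate validation pass is to check each document lazily while it streams through to the consumer. The sketch below is only an illustration under assumptions: `checked_corpus` is a hypothetical helper, not part of gensim, and it assumes documents are bag-of-words lists of `(term_id, weight)` pairs with a known `num_terms`.

```python
import math
import warnings


def checked_corpus(corpus, num_terms):
    """Yield documents from `corpus` unchanged, warning about suspicious entries.

    Hypothetical helper (not a gensim API). Checks run as each document
    streams through, so no extra pass over the data is required.
    """
    for doc_no, doc in enumerate(corpus):
        for term_id, weight in doc:
            if not 0 <= term_id < num_terms:
                # Out-of-range ids are the kind of input that can trigger
                # out-of-bounds indexing in sparse structures downstream.
                warnings.warn(
                    f"document {doc_no}: term id {term_id} is outside [0, {num_terms})"
                )
            elif not math.isfinite(weight):
                warnings.warn(
                    f"document {doc_no}: non-finite weight for term id {term_id}"
                )
        yield doc
```

Note that the wrapper is a generator, so it can be consumed only once; a model that iterates over the corpus several times would need it wrapped in a re-iterable class instead.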


rasto2211 commented on May 14, 2024

I would like to work on this issue. Could you please give me some pointers on where in the code to start?


tmylk commented on May 14, 2024

@rasto2211 Adding a warning to LdaModel.__init__ when the input is a list is a good way to start (item 4 above).
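A minimal sketch of such a check, using only the standard `warnings` module. `warn_if_list` is a hypothetical helper, not existing gensim code, and because item 4 is not quoted in this thread, the warning text is a placeholder rather than the intended wording.

```python
import warnings


def warn_if_list(corpus):
    """Emit a warning and return True when `corpus` is a plain list.

    Hypothetical helper: a constructor could call it before training.
    """
    if isinstance(corpus, list):
        warnings.warn(
            "corpus was passed as a plain list; "
            "please check that this is the intended input format",
            UserWarning,
        )
        return True
    return False
```

Using `warnings.warn` rather than raising keeps existing callers working while still surfacing the problem.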


rasto2211 commented on May 14, 2024

@piskvorky @menshikh-iv Do you also want to close this issue since you closed my PR without merging?


menshikh-iv commented on May 14, 2024

@rasto2211 No, because the remaining points are important (see Radim's comment).


menshikh-iv commented on May 14, 2024

Be attentive with #1732; I have already seen exactly the same problem twice.

