Comments (6)

automaticgiant avatar automaticgiant commented on July 19, 2024

totally happens on 35055f8 (newest at this time)

from jsat.

automaticgiant avatar automaticgiant commented on July 19, 2024

It seems that with certain components a HashedTextVectorCreator can be used directly, but TfIDF and OkapiBM25 need it to be internal to a HashedTextDataLoader, so that the weighting is initialized on a corpus before any vectors are created. My options seem to be either to use a data loader or to call setWeight myself, and I'm leaning towards the former. I'll give it a shot when I'm free today.
It could be more of a documentation issue.

EdwardRaff avatar EdwardRaff commented on July 19, 2024

Does this happen with the normal TextDataLoader? I'll hopefully get to testing this later tonight.

EdwardRaff avatar EdwardRaff commented on July 19, 2024

OK, now that I've read this, it's a documentation issue. The HashedTextVectorCreator expects the word weighting to already be configured. I'm going to try to write some improved Javadoc right now and improve the error message.

As I look back at this code, I think it could definitely be improved. I'm going to add it to my refactoring list in #1.
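
For readers following along, here is a minimal, self-contained sketch of why a corpus-dependent weighting has to be set up before a hashed vector creator can use it. It is not JSAT code: the class `HashedTfIdfSketch` and its methods `fit` and `vectorize` are hypothetical names invented for illustration, and the smoothed IDF formula is just one common variant, assumed here rather than taken from the library.

```java
import java.util.List;

/** Hypothetical illustration only; not part of JSAT and not its API. */
class HashedTfIdfSketch {
    private final int dimension;
    private final int[] docFrequency; // documents containing each hashed index
    private int totalDocs = 0;
    private boolean fitted = false;

    HashedTfIdfSketch(int dimension) {
        this.dimension = dimension;
        this.docFrequency = new int[dimension];
    }

    /** Hashing trick: map a token to a fixed-size index without a vocabulary. */
    private int indexOf(String token) {
        return Math.floorMod(token.hashCode(), dimension);
    }

    private static String[] tokenize(String doc) {
        return doc.toLowerCase().trim().split("\\s+");
    }

    /** Corpus pass: collects the document frequencies that the weighting needs. */
    void fit(List<String> corpus) {
        for (String doc : corpus) {
            boolean[] seen = new boolean[dimension];
            for (String tok : tokenize(doc)) {
                if (tok.isEmpty()) continue;
                int i = indexOf(tok);
                if (!seen[i]) {
                    seen[i] = true;
                    docFrequency[i]++;
                }
            }
            totalDocs++;
        }
        fitted = true;
    }

    /** Weights a single document; fails loudly if the corpus pass never happened. */
    double[] vectorize(String doc) {
        if (!fitted) {
            throw new IllegalStateException("weighting not initialized: call fit(corpus) first");
        }
        double[] vec = new double[dimension];
        for (String tok : tokenize(doc)) {
            if (!tok.isEmpty()) vec[indexOf(tok)] += 1.0; // raw term frequency
        }
        for (int i = 0; i < dimension; i++) {
            if (vec[i] > 0) {
                // Smoothed IDF (an assumed variant): terms found in every document get weight 0.
                vec[i] *= Math.log((1.0 + totalDocs) / (1.0 + docFrequency[i]));
            }
        }
        return vec;
    }
}
```

Calling `vectorize` before `fit` throws, which mirrors the situation in this thread: a weighting that depends on corpus statistics cannot produce meaningful values from a single document alone. A data loader that sees the whole corpus performs the equivalent of `fit` for you.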

EdwardRaff avatar EdwardRaff commented on July 19, 2024

I just tried adding some better documentation to the class descriptions. Please take a look and let me know if it clears things up.

automaticgiant avatar automaticgiant commented on July 19, 2024

f0e3a5f is super helpful. We are redoing the dataflow now to accommodate it.
