
crf's People

Contributors

asher-stern

crf's Issues

Convergence criterion is wrong

The L-BFGS convergence criterion is wrong (it requires the difference between successive function values to be small).

This leads to too many L-BFGS iterations and makes the run time far longer than necessary.
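The issue does not say which test should replace the current one. Purely as an illustration, here is a minimal sketch of two stopping tests commonly used with quasi-Newton optimizers: one on the change in function value and one on the gradient norm. The class name, method names, and tolerances are hypothetical and are not taken from this project.

```java
final class ConvergenceSketch {
    /** Function-value test: successive objective values differ by at most a relative tolerance. */
    static boolean valueDeltaConverged(double previousValue, double currentValue, double tolerance) {
        double scale = Math.max(1.0, Math.abs(previousValue));
        return Math.abs(previousValue - currentValue) <= tolerance * scale;
    }

    /** Gradient test: ||gradient|| <= tolerance * max(1, ||point||). */
    static boolean gradientConverged(double[] point, double[] gradient, double tolerance) {
        return norm(gradient) <= tolerance * Math.max(1.0, norm(point));
    }

    /** Euclidean norm of a vector. */
    static double norm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] point = {3.0, 4.0};      // hypothetical current weights
        double[] gradient = {1e-7, 0.0};  // hypothetical gradient at that point
        System.out.println(gradientConverged(point, gradient, 1e-6));    // prints "true"
        System.out.println(valueDeltaConverged(10.0, 10.001, 1e-6));     // prints "false"
    }
}
```

The two tests can disagree on the same iterate, which is why the choice of criterion affects how many iterations are run.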

Make sure values don't become non-finite

It can happen (in fact, it has happened to me) that values calculated during the forward-backward algorithm exceed Double.MAX_VALUE and are represented as POSITIVE_INFINITY.

We can't really compute with such values: they lead to NaN values and wreck the training procedure.
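To make the failure mode concrete, here is a small sketch of how an overflowed product turns into NaN once it is normalized. The class name and the potential values are made up for the demonstration; only the floating-point behavior is standard Java.

```java
final class OverflowDemo {
    /** Plain running product, as an unscaled forward pass would compute it. */
    static double product(double[] factors) {
        double result = 1.0;
        for (double f : factors) {
            result *= f;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] potentials = {1e200, 1e200};  // hypothetical large potentials
        double alpha = product(potentials);    // 1e400 does not fit in a double
        double z = alpha + alpha;              // normalizer is infinite as well
        double marginal = alpha / z;           // infinity / infinity
        System.out.println(alpha);             // prints "Infinity"
        System.out.println(marginal);          // prints "NaN"
    }
}
```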

I propose clamping to the maximum finite double value in these cases. Add the following lines to CrfUtilities.safeAdd():

if (variable == Double.POSITIVE_INFINITY) variable = Double.MAX_VALUE;
if (variable == Double.NEGATIVE_INFINITY) variable = -Double.MAX_VALUE;

And also provide a function CrfUtilities.safeMultiply():

public static double safeMultiply(double variable, final double val2) {
    final double oldValue = variable;
    variable *= val2;
    // Clamp overflow to the largest finite magnitude.
    if (variable == Double.POSITIVE_INFINITY) variable = Double.MAX_VALUE;
    if (variable == Double.NEGATIVE_INFINITY) variable = -Double.MAX_VALUE;
    // After clamping, only NaN (e.g. from 0 * infinity) is still non-finite.
    // Note that we have not added the check for
    // ((val2 < 0.0) && (oldValue < variable)) || ((val2 > 0.0) && (oldValue > variable)),
    // because floating-point arithmetic is inexact.
    if (!Double.isFinite(variable)) {
        throw new CrfException("Error: multiplying a \"double\" variable yielded unexpected results. "
                + "variable was: " + String.format("%1$.3f", oldValue) + ", value to multiply was: " + String.format("%1$.3f", val2));
    }
    return variable;
}

Then use these functions everywhere we do floating-point calculations.
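As a rough illustration of what using the clamping helper could look like, here is a self-contained sketch. It is not the project's code: the class name, the clampedProduct helper, and the input values are hypothetical, and IllegalStateException stands in for the project's CrfException so that the example compiles on its own.

```java
final class SafeMultiplyDemo {
    /** Multiplies two doubles, clamping overflow to the largest finite magnitude. */
    static double safeMultiply(double variable, double val2) {
        double result = variable * val2;
        if (result == Double.POSITIVE_INFINITY) return Double.MAX_VALUE;
        if (result == Double.NEGATIVE_INFINITY) return -Double.MAX_VALUE;
        if (Double.isNaN(result)) {
            // Stand-in for CrfException, which is not available outside the project.
            throw new IllegalStateException("multiplication yielded NaN: " + variable + " * " + val2);
        }
        return result;
    }

    /** Running product in which every step goes through safeMultiply. */
    static double clampedProduct(double[] factors) {
        double result = 1.0;
        for (double f : factors) {
            result = safeMultiply(result, f);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] potentials = {1e200, 1e200, -1.0};  // hypothetical values that overflow
        double p = clampedProduct(potentials);
        System.out.println(p == -Double.MAX_VALUE);  // prints "true"
        System.out.println(Double.isFinite(p / 2.0)); // prints "true"
    }
}
```

Clamping keeps later arithmetic finite, but the true magnitude is lost once a value has been clamped, so results computed from it are only approximate.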

I don't know too much about the internals of this project, so I would like your feedback on this.

Hello, I would like your e-mail or other contact information.

Hi,
I am a student from China, and I am studying your CRF code.
But I can't get the training data, the Penn Treebank corpus.
I know the corpus is not free, so I just want 10 to 20 sentences from it for testing.
Could you send me a few sentences for testing?
Thank you very much!

LiKun, Zhengzhou University, China
