Comments (3)

wangsibovictor commented on May 12, 2024

The analyzer currently provided by OpenNLP is quite inaccurate. I tried it on some generated data that includes names, postcodes, addresses, social security codes, etc. It does not recognize any of them.

Does the analyzer work well for your workload? My previous test workload seemed to work well. However, after I generated a new workload, the analyzer does not recognize the entities in it. For example, if I put the given name and family name together in one column, it can identify a person. If we put them into different columns, it does not recognize those columns as person.

from aurum-datadiscovery.

raulcf commented on May 12, 2024

Yes. Entity analysis is going to be an entirely new submodule, I presume. At the moment it is the most resource-intensive analysis and the most inaccurate of all.
This issue, however, is only about deciding when the entities detected for a field are good enough (that is, when the confidence that those entities represent the column is high) to stop processing. This is orthogonal to the performance and quality of the entity analyzer.
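
To make the "good enough to stop" idea concrete, here is a minimal sketch, not the project's actual implementation. It assumes a hypothetical per-value detector `detect` that returns an entity label or `None`; the function name `detect_column_entity`, the `min_samples` and `threshold` parameters, and the "share of values with the top label" confidence measure are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Tuple


def detect_column_entity(
    values: Iterable[str],
    detect: Callable[[str], Optional[str]],  # hypothetical per-value detector
    min_samples: int = 50,    # assumed: never decide on fewer values than this
    threshold: float = 0.9,   # assumed: share of values that must agree
) -> Tuple[Optional[str], float, int]:
    """Return (label, confidence, values_processed) for one column.

    Stops scanning as soon as the most frequent label covers at least
    `threshold` of the values seen so far (after `min_samples` values).
    """
    counts: Counter = Counter()
    seen = 0
    for value in values:
        seen += 1
        label = detect(value)
        if label is not None:
            counts[label] += 1
        if seen >= min_samples and counts:
            top_label, top_count = counts.most_common(1)[0]
            share = top_count / seen
            if share >= threshold:
                # Confident enough: skip the rest of the column.
                return top_label, share, seen
    if not counts:
        # Nothing was recognized (or the column was empty).
        return None, 0.0, seen
    # Column exhausted without reaching the threshold: report the best
    # candidate and its share so the caller can decide what to do.
    top_label, top_count = counts.most_common(1)[0]
    return top_label, top_count / seen, seen
```

With a toy detector such as `lambda v: "person" if v.istitle() else None`, a column of 1000 capitalized names would stop after the first 50 values. The expensive part, the detector itself, is untouched here, which is why this question is independent of the analyzer's accuracy.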

raulcf commented on May 12, 2024

This was dealt with in another issue.
