
biodiversity's People

Contributors

dimus, locodelassembly, mjy, pleary, servis, wkollernhm


biodiversity's Issues

Best practices for parsing a single name in native scripting language (daemon-mode?)

@mjy commented on Wed Jan 18 2017

Assuming the biodiversity gem is for all intents and purposes deprecated (i.e. no longer tracking individual improvements realized here), and assuming I want to use native Ruby, what's the best practice for using gnparser?

My use case is query processing: I get a single name string, break it down, and use the pieces to search against my index. Shelling out is possible, but the JVM is loaded on every call, so this isn't a good solution. Sending the query to a web endpoint would take too long. It seems like what's missing is a daemon-style approach? Or, more than likely, I'm missing something.

It seems like this is an issue for all the native scripting languages (R, Python, etc) that can't/won't J-ify themselves for whatever reason.


@alexander-myltsev commented on Thu Jan 19 2017

@mjy, did you consider using https://github.com/GlobalNamesArchitecture/gnparser#usage-as-a-socket-server? It expects a newline-delimited list of strings, where each string is a name to parse.
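A minimal Ruby sketch of a client for such a newline-delimited socket service is below. The helper name `parse_names`, the default host and port, and the assumption that the server answers with exactly one line per name are illustrative and not taken from the gnparser documentation.

```ruby
require "socket"

# Hypothetical helper: sends each name on its own line and collects
# one response line per name. Assumes the names contain no embedded
# newlines and that the server replies with one line per input line.
def parse_names(names, host: "localhost", port: 1234)
  TCPSocket.open(host, port) do |sock|
    names.map do |name|
      sock.puts(name.strip) # one name per line
      reply = sock.gets     # nil if the server closed the connection
      reply&.chomp
    end
  end
end

# Usage, with a server already listening on the given port:
#   parse_names(["Homo sapiens", "Plantago major"], port: 1234)
```

Keeping one connection open for a batch of names avoids paying the JVM start-up cost on every query, which is the concern raised above.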


@mjy commented on Thu Jan 19 2017

@alexander-myltsev Exploring that. It's currently not working, maybe because of my Java version? I followed the wget instructions, and running gnparse name "Homo sapiens" worked, then:

matt@MacBook-Pro-71 Downloads$ gnparse socket --port 1234
Exception in thread "main" java.lang.UnsupportedClassVersionError: akka/actor/ExtensionId : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:637)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at org.globalnames.GnParser$.main(GnParser.scala:75)
    at org.globalnames.GnParser.main(GnParser.scala)


@alexander-myltsev commented on Mon Jan 23 2017

Sorry for the late reply.

Akka requires Java 8 (class file version 52.0 in the error above is the Java 8 format), so you will need to update Java to use the socket server.
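As a rough illustration of how to read that error, the sketch below maps the class-file major version in an UnsupportedClassVersionError message to the Java release that introduced it (major version minus 44, so 52 corresponds to Java 8). The helper names `java_release_for` and `required_java` are made up for this example.

```ruby
# Maps a class-file major version to the Java release that introduced it
# (45 -> 1.1, 49 -> 5, 52 -> 8).
def java_release_for(major)
  raise ArgumentError, "unknown class-file major version: #{major}" if major < 45

  release = major - 44
  release < 5 ? "1.#{release}" : release.to_s
end

# Extracts the major version from an UnsupportedClassVersionError message
# and returns the required Java release, or nil if the message has no version.
def required_java(error_message)
  major = error_message[/Unsupported major\.minor version (\d+)\.\d+/, 1]
  major && java_release_for(major.to_i)
end

puts required_java("akka/actor/ExtensionId : Unsupported major.minor version 52.0")
# prints "8"
```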

Possible bug: Biodiversity::Parser.parse() complains about two arguments given

Hi,

Just installed the latest biodiversity gem.

When trying the example:

Biodiversity::Parser.parse("Plantago major", simple = true)

I get this error:

/root/.gem/gems/biodiversity-5.5.2/lib/biodiversity/parser.rb:36:in `parse': wrong number of arguments (given 2, expected 1) (ArgumentError)

Has the API changed? If so, the example should probably be updated.
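For context on the error itself: in Ruby, `simple = true` inside a call is an assignment expression whose value is passed as a second positional argument, not a keyword argument. The sketch below reproduces the message with a stand-in module (`ParserStub` is hypothetical and is not the gem's code) whose `parse` takes exactly one argument, as the error reports for 5.5.2.

```ruby
# Stand-in for a parser whose `parse` takes exactly one argument,
# matching the arity reported in the error above. Not the gem's code.
module ParserStub
  def self.parse(name)
    { verbatim: name }
  end
end

# Returns the ArgumentError message raised by the two-argument call,
# or nil if the call succeeds.
def two_argument_call_error
  # `simple = true` assigns a local variable and passes its value (true)
  # as a second positional argument.
  ParserStub.parse("Plantago major", simple = true)
  nil
rescue ArgumentError => e
  e.message
end

puts two_argument_call_error
# prints "wrong number of arguments (given 2, expected 1)"

# Shows which parameters a method actually accepts.
p ParserStub.method(:parse).parameters
# prints [[:req, :name]]
```

Calling `Biodiversity::Parser.method(:parse).parameters` on the installed gem should reveal the signature that version expects.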

Name elements that indicate taxon is a virus

It looks like names with these elements are not yet recognized as viruses, so capitalized words are stripped from the canonical form:

NPV, e.g., Papilio polyxenes NPV: http://eol.org/pages/41592578
RNA, e.g., Alternaria zinniae dsRNA element: http://eol.org/pages/11611917
virophage, e.g., Organic Lake virophage: http://eol.org/pages/20868817
satellites, e.g., Double-stranded RNA satellites: http://eol.org/pages/11603787
satellite, e.g., Whitefly VEM satellite: http://eol.org/pages/20858522
betasatellite, e.g., Tomato leaf curl China betasatellite: http://eol.org/pages/11603870
alphasatellite, e.g., Ageratum yellow vein Singapore alphasatellite: http://eol.org/pages/39738381
particle, e.g., Mouse Intracisternal A-particle: http://eol.org/pages/11609198
subgroup, e.g., Subgroup B: http://eol.org/pages/11623168 -- This is probably not limited to viruses, but it's very unlikely that any name that has this string in it will have author information associated with it.
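One way the elements listed above could be detected is sketched below. This is only an illustration, not the parser's grammar: the constant `VIRUS_LIKE_ELEMENTS` and the method `virus_like?` are hypothetical names, and the case-insensitive match is a simplification that may flag too many names.

```ruby
# Hypothetical pattern covering the elements listed in this issue.
# Word boundaries keep "particle" from matching inside longer words;
# the optional prefixes cover dsRNA/ssRNA and alpha-/betasatellite.
VIRUS_LIKE_ELEMENTS = /
  \b(?:
    NPV |
    (?:ds|ss)?RNA |
    virophage |
    (?:alpha|beta)?satellites? |
    particle |
    subgroup
  )\b
/ix

# Returns true when the name string contains one of the listed elements.
def virus_like?(name)
  VIRUS_LIKE_ELEMENTS.match?(name)
end

puts virus_like?("Papilio polyxenes NPV")                # prints true
puts virus_like?("Tomato leaf curl China betasatellite") # prints true
puts virus_like?("Plantago major")                       # prints false
```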

Porting to R

Hi there. I'm interested in porting this to R. However, I'm not sure how the treetop gem is used here. Is treetop required at run time (it seems like it might be, considering e.g. https://github.com/GlobalNamesArchitecture/biodiversity/blob/master/lib/biodiversity/parser/scientific_name_canonical.rb#L523), or just during development to create the Ruby classes/functions that are used in the biodiversity gem? I'm curious whether I can just port your Ruby functions to R, but if treetop is required at run time that seems much harder, as I don't think there's anything like treetop in R.

Names with low-cased subgenus are parsed like uninomials

Names like this are not parsed correctly because the parser does not handle a lowercase subgenus.

Monochamus (monohammus) galloprovincialis De Fluiter, 1950

A solution would be to accept names like this but ignore the content of the subgenus.
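The proposed behaviour could look roughly like the sketch below, which accepts a parenthesized subgenus in any letter case directly after the genus and drops it. `SUBGENUS_PATTERN` and `drop_subgenus` are hypothetical names, and a regular expression is used here only for illustration; the parser itself is grammar-based.

```ruby
# Genus, then a parenthesized subgenus in any case, then the rest of the name.
SUBGENUS_PATTERN = /\A(?<genus>[A-Z][a-z]+)\s+\((?<subgenus>[A-Za-z]+)\)\s+(?<rest>.+)\z/

# Returns the name with the subgenus removed, or the stripped name
# unchanged when no subgenus follows the genus.
def drop_subgenus(name)
  cleaned = name.strip
  match = SUBGENUS_PATTERN.match(cleaned)
  match ? "#{match[:genus]} #{match[:rest]}" : cleaned
end

puts drop_subgenus("Monochamus (monohammus) galloprovincialis De Fluiter, 1950")
# prints "Monochamus galloprovincialis De Fluiter, 1950"
```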

Parserver has to strip new lines

Previously, new lines were stripped as part of changing the verbatim string. Now verbatim stays unchanged, so parserver needs to strip new lines itself.
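A minimal sketch of that kind of normalization, assuming it is acceptable to replace each line break (and any whitespace around it) with a single space before parsing. The method name `strip_newlines` is hypothetical, and the original string is left untouched so it can still be reported as verbatim.

```ruby
# Returns a copy of the input with line breaks collapsed to single spaces.
# The input string itself is not modified.
def strip_newlines(input)
  input.gsub(/\s*[\r\n]+\s*/, " ").strip
end

verbatim = "Homo \r\n sapiens\n"
puts strip_newlines(verbatim) # prints "Homo sapiens"
```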
