sap218 / jabberwocky
NLP toolkit for those nonsensical ontologies
Home Page: https://sap218.github.io/jabberwocky/
License: MIT License
removing class/synonyms from textual data and then looking at the left-over text
Note to anyone reading this: the code will probably be really bad. But I'm not an expert. Pls forgive. :-)
I am not that sophisticated a Python developer, but I often see a requirements.txt used to declare requirements.
I did successfully install the software using the prerequisites command in the README, but breaking these out into a dependencies file would probably be a good idea.
Make a gh-pages site with the markdown of the README from jabber-test, and include the new test data!
Can you say which ontology formats are meant to work? I see that the pocket monsters example is in OWL/XML. I tried some other formats with catch (RDF/XML, TTL, OFN) and they all gave the same output. I was kind of surprised, but I guess it's using only the terms in listofwords.txt and not finding synonyms in the ontology? When I run with the OFN format file, I don't see any synonyms in ontology_dict_class_synonyms.json. I think it would be good to document supported ontology formats, especially since OWL/XML is a fairly uncommon format.
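One way a tool could report an unsupported serialization up front, rather than silently returning keyword-only results, is to sniff the syntax before parsing. The following is only a rough sketch using the Python standard library; the function name guess_ontology_syntax and its heuristics are assumptions made for illustration and are not part of jabberwocky.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def guess_ontology_syntax(text):
    """Heuristically guess the serialization of an ontology document.

    Hypothetical helper: it only looks at the XML root element or at the
    first meaningful line, so it is a hint rather than a validator.
    """
    stripped = text.lstrip()
    if stripped.startswith("<"):
        try:
            root = ET.fromstring(stripped)
        except ET.ParseError:
            return "unknown"
        if root.tag == "{%s}Ontology" % OWL_NS:
            return "OWL/XML"
        if root.tag == "{%s}RDF" % RDF_NS:
            return "RDF/XML"
        return "unknown"
    for line in stripped.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and Turtle-style comments
        if line.startswith(("Prefix(", "Ontology(")):
            return "OFN"
        if line.lower().startswith(("@prefix", "@base", "prefix ", "base ")):
            return "Turtle"
        return "unknown"
    return "unknown"


samples = {
    "owlxml": '<Ontology xmlns="http://www.w3.org/2002/07/owl#"/>',
    "rdfxml": '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>',
    "turtle": "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
    "ofn": "Prefix(owl:=<http://www.w3.org/2002/07/owl#>)\nOntology(<http://example.org/o>)",
}
guesses = {name: guess_ontology_syntax(text) for name, text in samples.items()}
print(guesses)
```

A check like this would let the command warn when it is handed a format it cannot read synonyms from.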
Related question: I made my own OWL/XML file with a single term 'eye' and used the annotation property http://www.geneontology.org/formats/oboInOwl#hasExactSynonym to add the synonym 'visual organ'. I don't see this being found and added to ontology_dict_class_synonyms.json.
Here is the ontology I used:
<?xml version="1.0"?>
<Ontology xmlns="http://www.w3.org/2002/07/owl#"
     xml:base="http://example.org/ontologies/2020/4/28/untitled-ontology-1490"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     ontologyIRI="http://example.org/ontologies/2020/4/28/untitled-ontology-1490">
    <Prefix name="owl" IRI="http://www.w3.org/2002/07/owl#"/>
    <Prefix name="rdf" IRI="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
    <Prefix name="xml" IRI="http://www.w3.org/XML/1998/namespace"/>
    <Prefix name="xsd" IRI="http://www.w3.org/2001/XMLSchema#"/>
    <Prefix name="rdfs" IRI="http://www.w3.org/2000/01/rdf-schema#"/>
    <Declaration>
        <Class IRI="/eye"/>
    </Declaration>
    <Declaration>
        <AnnotationProperty IRI="http://www.geneontology.org/formats/oboInOwl#hasExactSynonym"/>
    </Declaration>
    <AnnotationAssertion>
        <AnnotationProperty IRI="http://www.geneontology.org/formats/oboInOwl#hasExactSynonym"/>
        <IRI>/eye</IRI>
        <Literal>visual organ</Literal>
    </AnnotationAssertion>
    <AnnotationAssertion>
        <AnnotationProperty abbreviatedIRI="rdfs:label"/>
        <IRI>/eye</IRI>
        <Literal>eye</Literal>
    </AnnotationAssertion>
</Ontology>
(this is part of JOSS review openjournals/joss-reviews#2168)
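To make the expected behaviour concrete, here is a minimal sketch of how the label and exact synonym could be read out of an OWL/XML file like the single-term 'eye' ontology in this issue. It uses only the standard library's xml.etree.ElementTree; the function name collect_annotations and the shape of the returned dictionary are assumptions for illustration, not how catch is actually implemented.

```python
import xml.etree.ElementTree as ET

OWL = "{http://www.w3.org/2002/07/owl#}"
EXACT_SYNONYM = "http://www.geneontology.org/formats/oboInOwl#hasExactSynonym"

# Trimmed copy of the single-term ontology from the issue.
OWL_XML = """<?xml version="1.0"?>
<Ontology xmlns="http://www.w3.org/2002/07/owl#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <Declaration>
        <Class IRI="/eye"/>
    </Declaration>
    <AnnotationAssertion>
        <AnnotationProperty IRI="http://www.geneontology.org/formats/oboInOwl#hasExactSynonym"/>
        <IRI>/eye</IRI>
        <Literal>visual organ</Literal>
    </AnnotationAssertion>
    <AnnotationAssertion>
        <AnnotationProperty abbreviatedIRI="rdfs:label"/>
        <IRI>/eye</IRI>
        <Literal>eye</Literal>
    </AnnotationAssertion>
</Ontology>"""


def collect_annotations(owl_xml_text):
    """Map each annotated IRI to its rdfs:label and exact synonyms.

    Hypothetical helper: handles only the two annotation styles that
    appear in the example (abbreviated rdfs:label, full synonym IRI).
    """
    root = ET.fromstring(owl_xml_text)
    classes = {}
    for assertion in root.iter(OWL + "AnnotationAssertion"):
        prop = assertion.find(OWL + "AnnotationProperty")
        subject = assertion.find(OWL + "IRI")
        literal = assertion.find(OWL + "Literal")
        if prop is None or subject is None or literal is None:
            continue  # skip assertions this sketch does not understand
        entry = classes.setdefault(subject.text, {"label": None, "synonyms": []})
        if prop.get("abbreviatedIRI") == "rdfs:label":
            entry["label"] = literal.text
        elif prop.get("IRI") == EXACT_SYNONYM:
            entry["synonyms"].append(literal.text)
    return classes


result = collect_annotations(OWL_XML)
print(result)
```

With this input, the 'eye' class ends up with 'visual organ' as a synonym, which is what the issue expected to see in ontology_dict_class_synonyms.json.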
I understand catch and bite may not be useful names - I chose them as they are in the poem. If anyone has other ideas, I would welcome them (I would like the names to be consistent).
I installed jabberwocky and tried to run the first example from jabberwocky-tests. I got this output:
(pyenv) localhost:catch jim (master)$ catch --ontology ../ontology/pocketmonsters.owl --keywords listofwords.txt --textfile blogs_formatted.json --parameter blog_post > catch_output.txt
Traceback (most recent call last):
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/bin/catch", line 11, in <module>
    load_entry_point('jabberwocky==0.5.0.1', 'console_scripts', 'catch')()
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/jabberwocky-0.5.0.1-py3.7.egg/catch/catch.py", line 426, in main
    search_term_dictionary = ontologyPurl(ontology, keywords)
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/jabberwocky-0.5.0.1-py3.7.egg/catch/catch.py", line 77, in ontologyPurl
    souped = souping(ontology) # Reading in the ontology file
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/jabberwocky-0.5.0.1-py3.7.egg/catch/catch.py", line 25, in souping
    soup = BeautifulSoup(contents,'xml')
  File "/Users/jim/Documents/Reviews/JOSS/pyenv/lib/python3.7/site-packages/bs4/__init__.py", line 228, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: xml. Do you need to install a parser library?
I'm running in a virtual environment with only the jabberwocky dependencies installed.
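Beautiful Soup raises this FeatureNotFound error when the 'xml' feature is requested but lxml is not installed, so the likely gap is that lxml is missing from the declared dependencies. As a hedged illustration, a small standard-library check like the one below could flag missing packages before the parser is called. The helper name missing_modules and the REQUIRED list are assumptions based only on the modules visible in the traceback, not the project's actual dependency list.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Assumed list, inferred from the traceback (click, bs4) plus lxml,
# which Beautiful Soup needs for its 'xml' parser feature.
REQUIRED = ["click", "bs4", "lxml"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All required modules were found.")
```

Declaring the same packages in the install requirements would make a fresh virtual environment work without this kind of runtime check.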
(this is part of JOSS review openjournals/joss-reviews#2168)
in relation to openjournals/joss-reviews#2168