krr-oxford / deeponto
A package for ontology engineering with deep learning and language models.
Home Page: https://krr-oxford.github.io/DeepOnto/
License: Apache License 2.0
Dear all,
Thank you for DeepOnto.
I was wondering whether there is example code for consistency checking, e.g.
from deeponto.onto import Ontology
onto = Ontology("path_to_ontology.owl", "hermit")
assert onto.consistent()
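For reference, a hedged sketch of what this might look like through the OWLAPI reasoner that DeepOnto wraps (the owl_reasoner attribute name is an assumption; isConsistent() is the standard OWLAPI call):
from deeponto.onto import Ontology

# load the ontology with the HermiT reasoner, as in the snippet above
onto = Ontology("path_to_ontology.owl", "hermit")
# query consistency through the underlying OWLAPI reasoner object
# (the owl_reasoner attribute name is an assumption)
assert onto.reasoner.owl_reasoner.isConsistent()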
Is your feature request related to a problem? Please describe.
Using the ELK reasoner prints a lot of console messages.
Describe the solution you'd like
Disable INFO-level messages.
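Since these messages come from the Java side, Python's logging module will not silence them. A possible sketch, assuming ELK logs through a log4j binding on DeepOnto's classpath (an assumption; if it binds to slf4j's SimpleLogger instead, the org.slf4j.simpleLogger.defaultLogLevel system property would be the knob):
# raise the root log level on the Java side before running the reasoner;
# requires the JVM already started by deeponto, and assumes a log4j binding
from org.apache.log4j import Level, Logger  # type: ignore

Logger.getRootLogger().setLevel(Level.WARN)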
I have tried to run BERTMap, but got the following error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
In fact, this is a bug that was introduced in transformers 4.12.3 and has been fixed in 4.13.0. In short, the tokenizer's output is a BatchEncoding, but the Trainer only transfers objects of type Union[torch.Tensor, Tuple, List, Dict] to the GPU.
(Please refer to this link for more details: "When running the Trainer cell, it found two devices (cuda:0 and CPU)".)
I think this bug was introduced in commit 086a25cae945d496765cbbb09b36f9780d676ac7. Please consider pinning the transformers version.
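Until the dependency is pinned, a minimal runtime guard (assuming the bug is confined to transformers versions before 4.13.0) could be:
# fail fast if an affected transformers version is installed
import transformers
from packaging import version

assert version.parse(transformers.__version__) >= version.parse("4.13.0"), (
    "transformers >= 4.13.0 is required: older versions do not move "
    "BatchEncoding inputs to the GPU in Trainer"
)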
This is Shriram; I mailed you recently regarding my interest in using DeepOnto. I am currently working with two different autonomous-vehicle ontologies and am unable to run the BERTMap model due to "ValueError: evaluation strategy steps requires either non-zero --eval_steps or --logging_steps". I am not sure where this error is coming from.
/usr/local/lib/python3.10/dist-packages/transformers/training_args.py in __post_init__(self)
1301 self.eval_steps = self.logging_steps
1302 else:
-> 1303 raise ValueError(
1304 f"evaluation strategy {self.evaluation_strategy} requires either non-zero --eval_steps or"
1305 " --logging_steps"
ValueError: evaluation strategy steps requires either non-zero --eval_steps or --logging_steps
This is the entire error I am getting.
Could the number of instances in my ontology be the reason for this error? I have tried changing several values in my config YAML file, but none of them work. Kindly help me with this.
Thanks in advance!
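For context, the check that raises here lives in Hugging Face's TrainingArguments: with evaluation_strategy="steps", eval_steps (or logging_steps as a fallback) must be non-zero. If BERTMap derives these step counts from the size of the training corpus, a very small ontology could plausibly produce zero steps. A minimal illustration of a valid configuration (the values are illustrative only):
from transformers import TrainingArguments

# evaluation_strategy="steps" requires a non-zero eval_steps,
# otherwise a non-zero logging_steps is used as the fallback
args = TrainingArguments(
    output_dir="./out",
    evaluation_strategy="steps",
    eval_steps=100,
    logging_steps=100,
)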
Describe the bug
After following the installation instructions (via conda) for deeponto + pytorch + jpype, I cannot import the Ontology class. It seems that none of the org.* packages can be imported. Setting and exporting JAVA_HOME manually does not seem to fix the issue.
It fails with the following error (full traceback below):
ModuleNotFoundError: No module named 'org.slf4j'
Any advice would be appreciated.
Note: I am not familiar with the combination of Python and Java.
To Reproduce
Steps to reproduce the behavior:
conda create -n deeponto python=3.10
conda activate deeponto
conda install pytorch torchvision torchaudio cpuonly -c pytorch -c conda-forge
conda install -c conda-forge jpype1
from deeponto.onto import Ontology
Expected behavior
The Ontology class is imported.
Screenshots
Traceback below:
❯ ipython
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.24.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from deeponto.onto import Ontology
Please enter the maximum memory located to JVM [8g]:
INFO:deeponto:8g maximum memory allocated to JVM.
INFO:deeponto:JVM started successfully.
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 1
----> 1 from deeponto.onto import Ontology
File ~/.local/share/micromamba/envs/deeponto/lib/python3.10/site-packages/deeponto/onto/__init__.py:14
1 # Copyright 2021 Yuan He. All rights reserved.
2
3 # Licensed under the Apache License, Version 2.0 (the "License");
(...)
12 # See the License for the specific language governing permissions and
13 # limitations under the License.
---> 14 from .ontology import Ontology, OntologyReasoner
15 from .pruning import OntologyPruner
16 from .verbalisation import OntologyVerbaliser, OntologySyntaxParser
File ~/.local/share/micromamba/envs/deeponto/lib/python3.10/site-packages/deeponto/onto/ontology.py:47
45 from java.io import File # type: ignore
46 from java.lang import Runtime, System # type: ignore
---> 47 from org.slf4j.impl import SimpleLogger # type: ignore
48 System.setProperty(SimpleLogger.DEFAULT_LOG_LEVEL_KEY, "warn") # set slf4j default logging level to warning
49 from org.semanticweb.owlapi.apibinding import OWLManager # type: ignore
ModuleNotFoundError: No module named 'org.slf4j'
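A possible first diagnostic, assuming JPype's standard classpath helpers, is to check whether the slf4j jar bundled with deeponto actually made it onto the JVM classpath:
import jpype

# the classpath the JVM was started with; if no slf4j jar appears here,
# importing org.slf4j is bound to fail
print(jpype.getClassPath())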
Hi, I am not able to reproduce the exact H@1 and MRR for EditSim on the FMA-SNOMED task as reported in Table 4 of https://arxiv.org/pdf/2205.03447.pdf.
This is the command used:
python om_eval.py --saved_path './om_results' --pred_path './onto_match_experiment2/edit_sim/global_match/src2tgt' --ref_anchor_path 'data/equiv_match/refs/snomed2fma.body/unsupervised/src2tgt.rank/for_eval' --hits_at 1
These are the generated numbers: H@1: .841 and MRR: .89
Reported numbers in the paper: H@1: .869 and MRR: .895
I am not sure why the numbers are not consistent.
Is there anything that needs to be modified in the code to get the reported numbers?
Bug Description
I'm trying to verbalise a class expression. The code I'm executing is as follows:
from deeponto.onto import Ontology, OntologyVerbaliser, OntologySyntaxParser
onto = Ontology("ontology.owl")
verbaliser = OntologyVerbaliser(onto)
complex_concepts = list(onto.get_asserted_complex_classes())
v_concept = verbaliser.verbalise_class_expression(complex_concepts[0])
where ontology.owl is a simple ontology in RDF/XML syntax that contains an atomic concept, a datatype property, and a complex concept. The whole ontology is provided under Additional context.
I get the following error:
Traceback (most recent call last):
File "/home/pg-xai2/sampling/examples/prova_deeponto.py", line 42, in <module>
v_concept = verbaliser.verbalise_class_expression(complex_concepts[0])
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 227, in verbalise_class_expression
return self._verbalise_junction(parsed_class_expression)
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 334, in _verbalise_junction
other_children.append(self.verbalise_class_expression(child))
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 214, in verbalise_class_expression
return self._verbalise_iri(parsed_class_expression)
File "/home/pg-xai2/.conda/envs/ontolearn/lib/python3.9/site-packages/deeponto/onto/verbalisation.py", line 254, in _verbalise_iri
verbal = self.vocab[iri] if not self.keep_iri else iri_node.text
KeyError: 'http://dl-learner.org/mutagenesis#Compound'
This is the printed complex concept (maybe you can just try to manually construct this concept and test it out):
ObjectIntersectionOf(<http://dl-learner.org/mutagenesis#Compound> DataSomeValuesFrom(<http://dl-learner.org/mutagenesis#act> DatatypeRestriction(xsd:decimal facetRestriction(minInclusive "0.04"^^xsd:decimal))))
To Reproduce
Execute the code described above using the given ontology.
Additional context
OS: Linux
Content of ontology.owl:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xml:base="http://dl-learner.org/mutagenesis"
xmlns="http://dl-learner.org/mutagenesis#">
<owl:Ontology rdf:about="http://dl-learner.org/mutagenesis"/>
<owl:DatatypeProperty rdf:about="#act">
<rdfs:domain rdf:resource="#Compound"/>
<rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#double"/>
</owl:DatatypeProperty>
<owl:Class rdf:about="#Compound"/>
<owl:Class rdf:about="http://dl-learner.org/Pred_1">
<rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<rdf:Description rdf:about="#Compound"/>
<owl:Restriction>
<owl:onProperty rdf:resource="#act"/>
<owl:someValuesFrom>
<rdfs:Datatype>
<owl:onDatatype rdf:resource="http://www.w3.org/2001/XMLSchema#decimal"/>
<owl:withRestrictions>
<rdf:Description>
<rdf:first>
<rdf:Description>
<xsd:minInclusive rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">0.04</xsd:minInclusive>
</rdf:Description>
</rdf:first>
<rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</rdf:Description>
</owl:withRestrictions>
</rdfs:Datatype>
</owl:someValuesFrom>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
</rdf:RDF>
I tried other ontologies as well, including Carcinogenesis and the whole Mutagenesis ontology, which you can find here. Since they do not contain complex concepts, I tried to verbalise a subclass axiom as follows:
# get subsumption axioms from the ontology
subsumption_axioms = onto.get_subsumption_axioms(entity_type="Classes")
# verbalise the first subsumption axiom
v_sub, v_super = verbaliser.verbalise_class_subsumption_axiom(subsumption_axioms[0])
The same kind of error as mentioned earlier occurred.
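A possible workaround, assuming the keep_iri attribute seen in the traceback (verbalisation.py, line 254) is meant to be user-settable, is to skip the vocabulary lookup and keep raw IRIs:
# hypothetical workaround: fall back to raw IRIs instead of vocab lookup
verbaliser = OntologyVerbaliser(onto)
verbaliser.keep_iri = True  # attribute name taken from the traceback above
v_concept = verbaliser.verbalise_class_expression(complex_concepts[0])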
How can I obtain ontology embeddings for my own dataset?
While exploring the documentation I came across a small typo in the example code snippets. There is a quotation mark missing in a line of code there (see below). Nothing concerning, but I just wanted to let you know. It may cause an unexpected error for those who copy-paste the code :)
onto.get_subsumption_axioms(entity_type="Classes)
--> onto.get_subsumption_axioms(entity_type="Classes")
In addition to the library, consider also providing a Dockerfile that uses FastAPI to serve web APIs. For instance, instead of having to import the library, I could deploy a Docker container and call the APIs, providing all the necessary inputs.
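A minimal sketch of such a service, assuming deeponto is installed in the container (the endpoint name and payload are illustrative only, not an existing API):
from fastapi import FastAPI

from deeponto.onto import Ontology

app = FastAPI()

@app.post("/subsumption-axioms")
def list_subsumption_axioms(ontology_path: str):
    # load the ontology server-side and return its class subsumption axioms
    onto = Ontology(ontology_path)
    axioms = onto.get_subsumption_axioms(entity_type="Classes")
    return {"axioms": [str(ax) for ax in axioms]}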
Describe the bug
Under some circumstances during the mapping extension stage, the tokenizer throws IndexError: list index out of range.
The error originates at bert_classifier.py line 185.
This is the same error at the same location inside the tokenizer as huggingface/tokenizers#993, which was caused by the data passed to the tokenizer.
To Reproduce
I have reproduced this error with these settings:
Logs & stack trace | max_length_for_input | batch_size_for_training | Source ontology | Target ontology |
---|---|---|---|---|
link | 256 | 16 | music-representation.owl | musicClasses.owl @ 2ebb641 |
link | 128 | 8 | core.owl | musicClasses.owl @ ebc2d09 |
Expected behavior
The stage and the pipeline should complete successfully.
Describe the bug
The BERTMap model got stuck at the mapping extension phase.
To Reproduce
Steps to reproduce the behavior:
Run BERTMap on SNOMED-FMA (Body) task.
Describe the bug
While running BERTMap I'm receiving an error "ZeroDivisionError: division by zero"
To Reproduce
Launch BERTMap with these input files:
- configuration: bertmap.yaml
- source ontology: ontology-network.ttl
- target ontology: music.owl
Expected behavior
Mapping search between the ontologies should work normally.
Actual output
[Time: 00:18:47] - [PID: 172] - [Model: bertmap]
Load the following configurations:
{
"model": "bertmap",
"output_path": "/content",
"annotation_property_iris": [
"http://www.w3.org/2000/01/rdf-schema#label",
"http://www.geneontology.org/formats/oboInOwl#hasSynonym",
"http://www.geneontology.org/formats/oboInOwl#hasExactSynonym",
"http://www.w3.org/2004/02/skos/core#exactMatch",
"http://www.ebi.ac.uk/efo/alternative_term",
"http://www.orpha.net/ORDO/Orphanet_#symbol",
"http://purl.org/sig/ont/fma/synonym",
"http://www.w3.org/2004/02/skos/core#prefLabel",
"http://www.w3.org/2004/02/skos/core#altLabel",
"http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#P108",
"http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#P90"
],
"known_mappings": null,
"auxiliary_ontos": [],
"bert": {
"pretrained_path": "bert-base-uncased",
"max_length_for_input": 128,
"num_epochs_for_training": 3.0,
"batch_size_for_training": 16,
"batch_size_for_prediction": 128,
"resume_training": null
},
"global_matching": {
"enabled": true,
"num_raw_candidates": 200,
"num_best_predictions": 10,
"mapping_extension_threshold": 0.8,
"mapping_filtered_threshold": 0.9
}
}
[Time: 00:18:47] - [PID: 172] - [Model: bertmap]
Save the configuration file at /content/bertmap/config.yaml.
[Time: 00:18:47] - [PID: 172] - [Model: bertmap]
Construct new text semantics corpora and save at /content/bertmap/data/text-semantics.corpora.json.
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-12-a888744a31b2> in <cell line: 1>()
----> 1 bertmap = BERTMapPipeline(src_onto, tgt_onto, config)
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in __init__(self, src_onto, tgt_onto, config)
119 # load or construct the corpora
120 self.corpora_path = os.path.join(self.data_path, "text-semantics.corpora.json")
--> 121 self.corpora = self.load_text_semantics_corpora()
122
123 # load or construct fine-tune data
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in load_text_semantics_corpora(self)
251 corpora.save(self.data_path)
252
--> 253 return self.load_or_construct(self.corpora_path, data_name, construct)
254
255 self.logger.info(f"No training needed; skip the construction of {data_name}.")
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in load_or_construct(self, data_file, data_name, construct_func, *args, **kwargs)
227 else:
228 self.logger.info(f"Construct new {data_name} and save at {data_file}.")
--> 229 construct_func(*args, **kwargs)
230 # load the data file that is supposed to be saved locally
231 return FileUtils.load_file(data_file)
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/pipeline.py in construct()
241
242 def construct():
--> 243 corpora = TextSemanticsCorpora(
244 src_onto=self.src_onto,
245 tgt_onto=self.tgt_onto,
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/text_semantics.py in __init__(self, src_onto, tgt_onto, annotation_property_iris, class_mappings, auxiliary_ontos)
517 # build intra-ontology corpora
518 # negative sample ratios are by default
--> 519 self.intra_src_onto_corpus = IntraOntologyTextSemanticsCorpus(src_onto, annotation_property_iris)
520 self.add_samples_from_sub_corpus(self.intra_src_onto_corpus)
521 self.intra_tgt_onto_corpus = IntraOntologyTextSemanticsCorpus(tgt_onto, annotation_property_iris)
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/text_semantics.py in __init__(self, onto, annotation_property_iris, soft_negative_ratio, hard_negative_ratio)
310 self.onto = onto
311 # $\textsf{BERTMap}$ does not apply synonym transitivity
--> 312 self.thesaurus = AnnotationThesaurus(onto, annotation_property_iris, apply_transitivity=False)
313
314 self.synonyms = self.thesaurus.synonym_sampling()
/usr/local/lib/python3.10/dist-packages/deeponto/align/bertmap/text_semantics.py in __init__(self, onto, annotation_property_iris, apply_transitivity)
74 self.annotation_property_iris = iris
75 total_number_of_annotations = sum([len(v) for v in self.annotation_index.values()])
---> 76 self.average_number_of_annotations_per_class = total_number_of_annotations / len(self.annotation_index)
77
78 # synonym groups
ZeroDivisionError: division by zero
Following the stack trace, I see that the code uses the length of self.annotation_index as the denominator, but apparently this length is zero. This dictionary is built by Ontology.build_annotation_index() based on annotation_property_iris, which, as can be seen above, is correctly populated and not empty. So I suspect the bug is located somewhere in this function, but I wasn't able to pinpoint exactly where.
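A quick way to narrow this down might be to call the index construction directly; the return shape (the index plus the property IRIs actually used) is an assumption based on the call sites above:
# hedged diagnostic: an empty index here reproduces the division by zero
annotation_index, used_iris = src_onto.build_annotation_index(
    config.annotation_property_iris
)
print(len(annotation_index), used_iris)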