agdistis's People

Contributors

diegomoussallem, dobraczka, earthquakesan, firmao, lorenzbuehmann, lukasbluebaum, michaelroeder, ricardousbeck, seondong, vdanielupb, wencanluo, yamalight

agdistis's Issues

German DBpedia 2016-10

Using the following sentence:
<entity>Angela Merkel</entity> was in <entity>Germany</entity>

AGDISTIS returns the following in FOX:
scms:means http://de.dbpedia.org/resource/Angela_Merici ;
scms:source source:fox ;
ann:body "Angela Merkel"^^xsd:string
scms:means http://de.dbpedia.org/resource/Germany_Schulz ;
scms:source source:fox ;
ann:body "Germany"^^xsd:string

Please fix that and write a unit test for it.

Docker and log files

Please make sure that the newly deployed Docker containers on nc9 are persistent so we can use their log files for statistics.

Ubuntu Linux on an AWS EC2 instance works but Mac not

Steps to replicate problem:

Cloned https://github.com/AKSW/AGDISTIS (master branch)
Downloaded English index files:
2014: http://titan.informatik.uni-leipzig.de/rusbeck/agdistis/en/indexdbpedia_en_2014.7z
2016: wget http://titan.informatik.uni-leipzig.de/dmoussallem/dbpedia_index/en/indexdbpedia_en_2016.zip

Set config file src/main/resources/config/agdistis.properties so the 'index' property points to the directory that holds the extracted files. (We tried with both 2014 and then 2016)

Started AGDISTIS with 'mvn tomcat:run'. A test request such as the following returns a correct response:
curl --data-urlencode "text='The <entity>University of Leipzig</entity> in <entity>Barack Obama</entity>.'" -d type='agdistis' http://localhost:8080/AGDISTIS
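
For clients that are not shell scripts, the curl call above can be reproduced from Java. The following is a minimal sketch, not code from the AGDISTIS repository: the class name `AgdistisRequest` and both method names are hypothetical, and only the `text` and `type` parameters and the endpoint URL come from the command above. Note that the curl command also sends the literal single quotes as part of the `text` value; the sketch passes the text through unchanged and leaves that choice to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative only.
final class AgdistisRequest {

    /** Builds an application/x-www-form-urlencoded body with the two parameters used above. */
    static String formBody(String text, String type) {
        try {
            return "text=" + URLEncoder.encode(text, "UTF-8")
                    + "&type=" + URLEncoder.encode(type, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** POSTs the body and returns the raw response text; both streams are closed. */
    static String post(String endpoint, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Prints the encoded body only; call post(...) against a running endpoint to send it.
        System.out.println(formBody("The <entity>University of Leipzig</entity>.", "agdistis"));
    }
}
```

With a local server running, `AgdistisRequest.post("http://localhost:8080/AGDISTIS", body)` would send the request; a non-2xx status surfaces as an `IOException` from `getInputStream()`.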

We tested 2510 documents, sending them to the local endpoint one by one using a simple Python script. The files are UTF-8 plain text with named entities enclosed by ... . An archive containing the first 129 documents is attached.

Some documents are processed without problems, but at some point the server starts responding with "Internal Server Error" (an HTML template), and a few documents later the endpoint stops responding completely.
The first failure doesn't always occur at the same document. Sometimes the service fails at a document it processed successfully in a previous run (after restarting).

There's also one document that crashes the server every time, right away: adidas.001.d-6mTcK2mUsS8HKwz9Fyk2Z5ZDE.txt (see test_docs.zip)

We tried debugging the source code while processing this single file. We found that the code always crashes at the same line (see exception in the log) but always while processing a different random named entity in the text.

I'm using OS X 10.12.3 and Java 1.8.0_92. One more thing, I had to do "ulimit -n 10000" before running the server because without it, the server always crashes with a "Too many open files" exception (different from the exceptions happening otherwise).
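
The "Too many open files" exception mentioned above means the process hit its file descriptor limit. One way to see whether descriptors keep climbing during a batch run is to read the JVM's own counters. This is a minimal sketch, assuming a JVM on a Unix-like system that exposes `com.sun.management.UnixOperatingSystemMXBean`; the class name `FdProbe` is hypothetical. The probe reports on the JVM it runs in, so to observe the Tomcat process it would have to be called from inside the web application, or the same bean read over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical diagnostic helper; not part of AGDISTIS.
final class FdProbe {

    /** Returns the open and maximum file descriptor counts of the current JVM, if available. */
    static String describe() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return "open=" + unix.getOpenFileDescriptorCount()
                    + " max=" + unix.getMaxFileDescriptorCount();
        }
        // Non-Unix JVMs do not expose these counters through this bean.
        return "file descriptor counts not available on this JVM";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the open count rises with every processed document and never drops, that would point to resources not being closed rather than to a limit that is simply too low.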

test_docs.zip

Strange behaviour with properties files

Sorry in advance if this question is too Java-specific for the issue tracker, but I am having problems with the provided properties file.

I want to use AGDISTIS as a dependency in my Maven project, so I have added AGDISTIS as a JAR file to my local Maven repository. To keep the properties file accessible, I moved it onto my classpath under /my-program/src/main/resources/agdistis-config/agdistis.properties instead of /my-program/local-repository/org/aksw/AGDISTIS/0.4.0/AGDISTIS-0.4.0.jar/config/agdistis.properties.

I also changed the code in every class where the properties file is accessed, for example:

public NEDAlgo_HITS() throws IOException {

    Properties prop = new Properties();
    InputStream input = NEDAlgo_HITS.class.getResourceAsStream("/agdistis-config/agdistis.properties");
    prop.load(input);

    ...

}

Before adding the JAR to my local repository, I even deleted the whole config folder in AGDISTIS to prevent some kind of fallback and to force the use of my own properties file outside of AGDISTIS.

Now comes the strange part: I can run my program, but it uses neither my new properties file nor the old one (which I have deleted). Instead it uses some kind of default values for AGDISTIS, which appear as soon as the Properties object is created. This is, of course, not desired.

So, how can I extract the properties file out of the JAR properly? In other words: How can I configure AGDISTIS from my own program with AGDISTIS as a dependency?

Any help is much appreciated. :)
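
One common way to make such a configuration overridable from the outside is to check an explicit external location first and fall back to the bundled classpath resource, failing loudly if neither exists. The sketch below illustrates that pattern only; it is not how AGDISTIS currently loads its configuration. The class name `AgdistisConfig` and the system property `agdistis.config` are hypothetical, while the resource path `/config/agdistis.properties` is taken from the JAR layout described above. `Class.getResourceAsStream` returns `null` when the resource is missing, so the sketch checks for that explicitly instead of passing the result straight to `Properties.load`.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical loader; names are illustrative only.
final class AgdistisConfig {
    // Assumed system property naming an external properties file.
    static final String OVERRIDE_PROPERTY = "agdistis.config";
    // Location of the bundled file inside the JAR, as described in the question.
    static final String CLASSPATH_RESOURCE = "/config/agdistis.properties";

    /** Says which source would be used, which is handy to log at startup. */
    static String describeSource(String externalPath) {
        return externalPath != null ? "file:" + externalPath : "classpath:" + CLASSPATH_RESOURCE;
    }

    /** Loads the external file if configured, else the bundled resource; never returns silent defaults. */
    static Properties load() {
        try (InputStream in = open(System.getProperty(OVERRIDE_PROPERTY))) {
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static InputStream open(String externalPath) throws IOException {
        if (externalPath != null) {
            return Files.newInputStream(Paths.get(externalPath));
        }
        InputStream in = AgdistisConfig.class.getResourceAsStream(CLASSPATH_RESOURCE);
        if (in == null) {
            throw new FileNotFoundException("No " + CLASSPATH_RESOURCE
                    + " on the classpath and -D" + OVERRIDE_PROPERTY + " is not set");
        }
        return in;
    }

    public static void main(String[] args) {
        try {
            System.out.println(load().stringPropertyNames());
        } catch (UncheckedIOException e) {
            System.out.println("Configuration missing: " + e.getCause().getMessage());
        }
    }
}
```

Under these assumptions the program would be started with `-Dagdistis.config=/path/to/agdistis.properties`. Logging `describeSource(...)` at startup makes it visible which file is actually in use.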

Run your own webservice

% mvn tomcat:run
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.aksw:AGDISTIS:jar:0.0.1-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 12, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-jar-plugin is missing. @ line 51, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ line 39, column 12
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building AGDISTIS 0.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> tomcat-maven-plugin:1.1:run (default-cli) @ AGDISTIS >>>
Downloading: http://maven.eclipse.org/nexus/content/groups/maven-central/org/aksw/autosparql/commons/1.0-SNAPSHOT/maven-metadata.xml
Jun 14, 2014 5:16:48 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
Jun 14, 2014 5:16:48 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
Jun 14, 2014 5:18:55 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection timed out
Jun 14, 2014 5:18:55 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request

Semantic Analysis using AGDISTIS

Hi,
I need to semantically analyse the following sentences:

  1. Adams is a famous singer
  2. Adams is a famous chess player

For sentence 1, "Adams" should resolve to the DBpedia URLs for Bryan Adams, the famous singer. For sentence 2, "Adams" should resolve to the URLs for Michael Adams, the famous chess player. Is it possible to achieve such a use case with AGDISTIS? Are there any other frameworks through which I can achieve the same?

Regards
suryahub

How to work with plain text input in AGDISTIS

Hi,
In the AGDISTIS sample code, a pre-annotated text is given as input, with the entities embedded inside entity tags. Is there an option to give plain text as input to AGDISTIS, so that it automatically recognises the entities and then does the disambiguation? I have heard about FOX. If the same can be done using FOX, how do I integrate FOX with AGDISTIS?

Regards
suryahub

Clarify index2 naming in config

Currently the config has the following two fields:

index=index
index2=index_bycontext

From the name alone it is not clear what index2 is or how it is used.
I'd suggest doing the following:

  • (short term) add comment to config file clarifying what it does
  • (long term) rename it to reflect its role

Disambiguated URL is null

I cloned the following project from GitHub:
https://github.com/AKSW/AGDISTIS
and configured it according to the following link:

https://github.com/AKSW/AGDISTIS/wiki/3-Running-the-webservice

When I run the code used in AGDISTISTest on the following pre-annotated text, the disambiguated URLs are null.
Text:
Barack Obama visits Merkel in Berlin.

Please see the console output:
16:21:23,140 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 64 -
16:21:23,141 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 70 - < Label: Barack Obama>
16:21:23,196 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 132 - < number of candidates before type reduction: 0>
16:21:23,197 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 214 - <Size of Reduced Candidates 0>
16:21:23,197 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 138 - < number of candidates after type reduction: 0>
16:21:23,197 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 143 - < No candidates for: Barack Obama>
Searching this label in surface forms..Barack Obama
16:21:23,277 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 132 - < number of candidates before type reduction: 697>
16:21:23,401 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 214 - <Size of Reduced Candidates 0>
16:21:23,401 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 138 - < number of candidates after type reduction: 0>
16:21:23,401 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 143 - < No candidates for: Barack Obama>
16:21:23,401 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 78 - < Graph size: 0 took: 260 ms>
16:21:23,401 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 70 - < Label: Berlin>
16:21:23,402 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 132 - < number of candidates before type reduction: 0>
16:21:23,402 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 214 - <Size of Reduced Candidates 0>
16:21:23,404 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 138 - < number of candidates after type reduction: 0>
16:21:23,404 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 143 - < No candidates for: Berlin>
Searching this label in surface forms..Berlin
16:21:23,428 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 132 - < number of candidates before type reduction: 1000>
16:21:23,549 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 214 - <Size of Reduced Candidates 0>
16:21:23,549 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 138 - < number of candidates after type reduction: 0>
16:21:23,551 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 143 - < No candidates for: Berlin>
16:21:23,551 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 78 - < Graph size: 0 took: 150 ms>
16:21:23,551 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 70 - < Label: Merkel>
16:21:23,552 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 132 - < number of candidates before type reduction: 0>
16:21:23,552 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 214 - <Size of Reduced Candidates 0>
16:21:23,552 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 138 - < number of candidates after type reduction: 0>
16:21:23,552 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 143 - < No candidates for: Merkel>
Searching this label in surface forms..Merkel
16:21:23,557 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 132 - < number of candidates before type reduction: 93>
16:21:23,562 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 214 - <Size of Reduced Candidates 0>
16:21:23,562 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 138 - < number of candidates after type reduction: 0>
16:21:23,562 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 143 - < No candidates for: Merkel>
16:21:23,562 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 78 - < Graph size: 0 took: 11 ms>
16:21:23,562 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 69 - < Graph size before BFS: 0>
16:21:23,563 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 72 - < Graph size after BFS: 0>
Merkel -> null
Barack Obama -> null
Berlin -> null

Build Failure with new 2016 index

Hi,

I have cloned the repository and tried to build AGDISTIS with the newly released 2016 index. When running mvn clean install, the testUmlaute method fails. With the old 2014 index, everything compiles fine and all tests pass.


T E S T S

Running TripleIndexTest
16:27:00,757 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,159 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,226 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,290 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,356 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,418 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,756 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.238 sec
Running AGDISTISTest
16:27:01,826 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:01,869 INFO [org.aksw.agdistis.webapp.GetDisambiguation] 103 - < Text: Masaaki_Ōsumi works in Japan.>
16:27:01,875 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 69 - < Label: Masaaki_Ōsumi>
16:27:01,976 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 129 - < number of candidates before type reduction: 137>
16:27:02,022 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 129 - < number of candidates before type reduction: 328>
16:27:02,046 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 77 - < Graph size: 0 took: 171 ms>
16:27:02,046 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 69 - < Label: Japan>
16:27:02,155 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 129 - < number of candidates before type reduction: 1000>
16:27:02,197 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 77 - < Graph size: 1 took: 151 ms>
16:27:02,207 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 67 - < Graph size before BFS: 1>
16:27:02,288 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 70 - < Graph size after BFS: 105>
Japan -> http://dbpedia.org/resource/Japan
Masaaki_Ōsumi -> null
16:27:02,327 INFO [org.aksw.agdistis.util.TripleIndex] 56 -
16:27:02,369 INFO [org.aksw.agdistis.webapp.GetDisambiguation] 103 - < Text: Barack Obama visits Angela Merkel in Berlin.>
16:27:02,370 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 69 - < Label: Angela Merkel>
16:27:02,506 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 129 - < number of candidates before type reduction: 1000>
16:27:02,559 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 77 - < Graph size: 1 took: 189 ms>
16:27:02,560 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 69 - < Label: Barack Obama>
16:27:02,640 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 129 - < number of candidates before type reduction: 1000>
16:27:02,761 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 77 - < Graph size: 3 took: 201 ms>
16:27:02,761 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 69 - < Label: Berlin>
16:27:02,846 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 129 - < number of candidates before type reduction: 1000>
16:27:02,880 INFO [org.aksw.agdistis.algorithm.CandidateUtil] 77 - < Graph size: 6 took: 118 ms>
16:27:02,881 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 67 - < Graph size before BFS: 6>
16:27:03,137 INFO [org.aksw.agdistis.algorithm.NEDAlgo_HITS] 70 - < Graph size after BFS: 443>
Berlin -> http://dbpedia.org/resource/Berlin
Angela Merkel -> http://dbpedia.org/resource/Angela_Merkel
Barack Obama -> http://dbpedia.org/resource/Barack_Obama
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.376 sec <<< FAILURE!
testUmlaute(AGDISTISTest) Time elapsed: 0.486 sec <<< FAILURE!
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at AGDISTISTest.testUmlaute(AGDISTISTest.java:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

Running HitsTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.033 sec
Running PageRankTest
Master:0 H: 1 A: 1 PR: 0,88 ( = 1/5 * 0.15 + 0.85)
Slave1:0 H: 1 A: 1 PR: 0,03 ( = 1/5 * 0.15)
Slave2:0 H: 1 A: 1 PR: 0,03 (usw.)
Slave3:0 H: 1 A: 1 PR: 0,03
Slave4:0 H: 1 A: 1 PR: 0,03
EQUAL1: N1:0 H: 1 A: 1 PR: 0,25
EQUAL2: N1:0 H: 1 A: 1 PR: 0,25
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.023 sec

Results :

Failed tests: testUmlaute(AGDISTISTest)

Tests run: 13, Failures: 1, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.934 s
[INFO] Finished at: 2017-01-12T16:27:03+01:00
[INFO] Final Memory: 25M/281M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project AGDISTIS: There are test failures.
[ERROR]
[ERROR] Please refer to /home/ph/Downloads/AGDISTIS/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

Best regards

Environment variables instead of properties

I followed the "running with docker" instructions on the following wiki page:
https://github.com/dice-group/AGDISTIS/wiki/3-Running-the-webservice

I downloaded and unpacked the following German DBpedia file:
dbpedia_index_2016-04/de/indexdbpedia_de_2016.zip
and renamed the folder to "index".

I then ran the following commands:

docker pull aksw/agdistis
docker run -d -p 8080:8080 -v `pwd`/index:/usr/local/tomcat/index aksw/agdistis
docker start <container_name>

When I query the webservice using the following command:
curl --data-urlencode "text='<entity>Angela Merkel</entity>.'" -d type='agdistis' http://localhost:8080/AGDISTIS

I receive what look to be dummy URLs, suggesting the entity cannot be found in the index:
[{"disambiguatedURL":"http:\/\/aksw.org\/notInWiki\/AngelaMerkel","offset":13,"namedEntity":"Angela Merkel","start":1}]

I tried a number of entities that I would expect to return a URL, all without success. I receive the same dummy URL results if I point the docker container at the 2014 German index. However, I can return results for English (using the 2016 and 2014 indices):
[{"disambiguatedURL":"http:\/\/dbpedia.org\/resource\/Angela_Merkel","offset":13,"namedEntity":"Angela Merkel","start":1}]
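
A client can tell the two kinds of response apart by looking at the URL prefix. This is a minimal sketch, assuming that every unresolved entity is reported under `http://aksw.org/notInWiki/` as in the response above (the source shows only this one example); the class and method names are hypothetical. The `\/` sequences in the raw JSON are ordinary JSON escapes for `/`, so the check is meant to run on the value after JSON parsing.

```java
// Hypothetical client-side helper; not part of AGDISTIS.
final class DisambiguatedUrl {
    // Prefix observed in the dummy URL above; assumed to mark every unresolved entity.
    static final String NOT_IN_WIKI_PREFIX = "http://aksw.org/notInWiki/";

    /** True if the service returned a real resource rather than null or a placeholder. */
    static boolean isResolved(String url) {
        return url != null && !url.startsWith(NOT_IN_WIKI_PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(isResolved("http://aksw.org/notInWiki/AngelaMerkel"));
        System.out.println(isResolved("http://dbpedia.org/resource/Angela_Merkel"));
    }
}
```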

Are you able to advise whether there is a problem/difference with the German index, or perhaps with my execution of the steps for German?

(assigned also to @mejohnee)

Links to things

AGDISTIS links "Mcnabb Inc" to dbr:McNabb, which is of type Thing in DBpedia.

Minimal Ontology Example not working

Given the knowledge base below:

http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Company .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Organisation .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://schema.org/Organization .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#SocialPerson .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.wikidata.org/entity/Q43229 .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://dbpedia.org/ontology/Agent .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#Agent .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Thing .
http://fairhair.ai/kg/resource/Evertec http://www.w3.org/2000/01/rdf-schema#label "Evertec"@en .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/address "Cupey Center Building Road 176"@en .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/numberOfEmployees "1660"^^http://www.w3.org/2001/XMLSchema#nonNegativeInteger .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/locationCity http://dbpedia.org/resource/San_Juan,_Puerto_Rico .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/location http://dbpedia.org/resource/Puerto_Rico .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/locationCountry http://dbpedia.org/resource/United_States .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/industry http://dbpedia.org/resource/Information_technology .
http://fairhair.ai/kg/resource/Evertec http://dbpedia.org/ontology/foundingYear "2004"^^http://www.w3.org/2001/XMLSchema#gYear .

AGDISTIS should be able to disambiguate the following sentence: 'Evertec is a company in Puerto Rico.'

Currently it returns the "don't know" links for both entities.

Please write a unit test for it.

Wikidata

Create an index for disambiguation against the English Wikidata.

"laparotomy" and "surgeon"

Hi Ricardo,

Just to mention there might be something wrong with AGDISTIS.

I tried a sentence with "laparotomy" and "surgeon" and it did not provide a disambiguation (I put both terms in square brackets).

Greetings,

Denis

BUILD FAILURE of mvn tomcat:run

[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.aksw:AGDISTIS:war:0.0.1-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 12, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-jar-plugin is missing. @ line 51, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ line 39, column 12
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[...]

[INFO] Running war on http://localhost:8080/AGDISTIS
[INFO] Creating Tomcat server configuration at /home/USER/tutorial_workspace/AGDISTIS/target/tomcat
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6:13.210s
[INFO] Finished at: Mon Jul 28 12:13:24 CEST 2014
[INFO] Final Memory: 26M/148M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:tomcat-maven-plugin:1.1:run (default-cli) on project AGDISTIS: Could not start Tomcat: Protocol handler initialization failed: java.net.BindException: Die Adresse wird bereits verwendet :8080 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.aksw:AGDISTIS:war:0.0.1-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 12, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-jar-plugin is missing. @ line 51, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ line 39, column 12
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
Downloading: https://raw.github.com/numberformat/20130213/master/repo/org/apache/maven/plugins/maven-source-plugin/maven-metadata.xml
Downloading: https://raw.github.com/numberformat/20130213/master/repo/org/apache/maven/plugins/maven-jar-plugin/maven-metadata.xml
Downloading: https://raw.github.com/numberformat/20130213/master/repo/github/numberformat/blog-plugin/1.0-SNAPSHOT/maven-metadata.xml
Downloading: https://raw.github.com/numberformat/20130213/master/repo/org/codehaus/mojo/maven-metadata.xml
Downloading: https://raw.github.com/numberformat/20130213/master/repo/org/apache/maven/plugins/maven-metadata.xml
Downloading: https://raw.github.com/numberformat/20130213/master/repo/org/codehaus/mojo/tomcat-maven-plugin/maven-metadata.xml
Jul 28, 2014 12:13:24 PM org.apache.catalina.startup.Embedded start
Information: Starting tomcat server
Jul 28, 2014 12:13:24 PM org.apache.catalina.core.StandardEngine start
Information: Starting Servlet Engine: Apache Tomcat/6.0.29
Jul 28, 2014 12:13:24 PM org.apache.coyote.http11.Http11Protocol init
Schwerwiegend: Error initializing endpoint
java.net.BindException: Die Adresse wird bereits verwendet :8080
at org.apache.tomcat.util.net.JIoEndpoint.init(JIoEndpoint.java:549)
at org.apache.coyote.http11.Http11Protocol.init(Http11Protocol.java:176)
at org.apache.catalina.connector.Connector.initialize(Connector.java:1014)
at org.apache.catalina.startup.Embedded.start(Embedded.java:830)
at org.codehaus.mojo.tomcat.AbstractRunMojo.startContainer(AbstractRunMojo.java:558)
at org.codehaus.mojo.tomcat.AbstractRunMojo.execute(AbstractRunMojo.java:255)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
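
The German message in the `BindException` above, "Die Adresse wird bereits verwendet", is the localized form of "Address already in use": some other process was already listening on port 8080 when Tomcat tried to start. A quick way to check this before launching is to try binding the port. The class name `PortCheck` below is hypothetical, and the result is only a snapshot, since another process can still grab the port between the check and the actual start.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical pre-flight check; not part of AGDISTIS.
final class PortCheck {

    /** Returns true if a server socket can be bound to the port right now. */
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true; // bind succeeded; the socket is released when the block exits
        } catch (IOException e) {
            return false; // typically a BindException: address already in use
        }
    }

    public static void main(String[] args) {
        System.out.println("8080 free: " + isPortFree(8080));
    }
}
```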

Handling entities with punctuations

When I run a Chinese webservice using Java and query it with:

curl --data-urlencode "text='<entity>???</entity>.'" -d type='agdistis' http://localhost:8080/AGDISTIS
I get:

[{"disambiguatedURL":"http:\/\/aksw.org\/notInWiki\/???","offset":3,"namedEntity":"???","start":1}]
And in the terminal window where the webservice is running I see an error:

17:31:08,377 ERROR [org.aksw.agdistis.util.TripleIndex] 143 - <Cannot parse '': Encountered "<EOF>" at line 1, column 0.
Was expecting one of:
   <NOT> ...
   "+" ...
   "-" ...
   <BAREOPER> ...
   "(" ...
   "*" ...
   <QUOTED> ...
   <TERM> ...
   <PREFIXTERM> ...
   <WILDTERM> ...
   <REGEXPTERM> ...
   "[" ...
   "{" ...
   <NUMBER> ...
   <TERM> ...
   "*" ...
    -> null>
Oct 05, 2017 5:31:08 PM org.restlet.engine.log.LogFilter afterHandle
INFO: 2017-10-05	17:31:08	0:0:0:0:0:0:0:1	-	0:0:0:0:0:0:0:1	8080	POST	/AGDISTIS	-	200	111	80	31	http://localhost:8080	curl/7.54.0	-
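
The "Cannot parse ''" message above looks like Lucene's query parser rejecting an empty query string. That would happen if the entity label is reduced to nothing before the index lookup, for example if punctuation is stripped and the label consists only of characters such as "?". The source does not show how AGDISTIS preprocesses labels, so the following is only a sketch of a possible guard under that assumption. `LabelGuard` and its methods are hypothetical names.

```java
// Hypothetical guard; the normalization shown here is an assumption, not AGDISTIS code.
final class LabelGuard {

    /** Replaces ASCII punctuation with spaces and collapses runs of whitespace. */
    static String normalize(String label) {
        return label.replaceAll("\\p{Punct}", " ").trim().replaceAll("\\s+", " ");
    }

    /** True if something is left to search for after normalization. */
    static boolean isSearchable(String label) {
        return label != null && !normalize(label).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isSearchable("???"));
        System.out.println(normalize("Washington, D.C."));
    }
}
```

A caller could skip the index lookup and report the entity as unresolved whenever `isSearchable` returns false, instead of handing an empty string to the query parser.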

Word2Vec

Talk to Michael Hoff about how to compute it across many candidates.

[400 Error] Python and CURL requests

Good afternoon everyone,

I am a new user and it seems that the API does not work when called from the official package agdistispy (400 error).

Also, the following request returns ERR_INVALID_URL:
curl --data-urlencode "text='Barack Obama arrives in Washington, D.C.'" -d type='agdistis' http://139.18.2.164:8080/AGDISTIS

I am located in France, and the API was still working yesterday.
Would it be possible to look into the issue?

Thanks in advance,

Sammy Khalife

Converting Natural Language to Database Queries

Hi,
Is there an option to integrate a database (SQL or NoSQL) with AGDISTIS and convert natural language statements to database queries?
E.g.:
"invoices paid last year"
The system would interpret the above sentence, convert it to the corresponding database query, and fetch the results from the database.
I found a similar use case with DBpedia Spotlight for converting natural language to SPARQL queries:
https://dbpedia-spotlight.github.io/demo/

Is such a thing achievable with AGDISTIS?

Regards
suryahub

Influence of capitalization

Capitalization seems to have a strong influence on whether an entity can be resolved or not. For example, Leipzig and Germany are resolved to the correct DBpedia resources, whereas the URL for leipzig and germany is null.

Input:

[Leipzig] is the coolest city in [Germany]. 

Output:

{
  "namedEntities": [
    {
      "namedEntity": "Germany",
      "start": 32,
      "end": 39,
      "offset": 7,
      "disambiguatedURL": "http://dbpedia.org/resource/Germany"
    },
    {
      "namedEntity": "Leipzig",
      "start": 1,
      "end": 8,
      "offset": 7,
      "disambiguatedURL": "http://dbpedia.org/resource/Leipzig"
    }
  ],
  "detectedLanguage": "en",
  "$promise": {},
  "$resolved": true
}

Input:

[leipzig] is the coolest city in [germany]. 

Output:

{
  "namedEntities": [
    {
      "namedEntity": "leipzig",
      "start": 1,
      "end": 8,
      "offset": 7,
      "disambiguatedURL": null
    },
    {
      "namedEntity": "germany",
      "start": 32,
      "end": 39,
      "offset": 7,
      "disambiguatedURL": null
    }
  ],
  "detectedLanguage": "en",
  "$promise": {},
  "$resolved": true
}
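
Until this is handled on the server side, one possible client-side workaround is to normalize the mention's casing before sending it. The sketch below is only an illustration under that assumption: the class and method names are hypothetical, it is not part of AGDISTIS, and it will give wrong casing for mentions such as acronyms or names with internal capitals.

```java
/** Hypothetical client-side helper; not part of AGDISTIS. */
final class MentionCase {

    /**
     * Upper-cases the first character of each whitespace-separated token and
     * leaves every other character untouched. The result has the same length
     * as the input, so start and end positions computed on the original text
     * stay valid.
     */
    static String capitalizeTokens(String mention) {
        StringBuilder out = new StringBuilder(mention.length());
        boolean atTokenStart = true;
        for (int i = 0; i < mention.length(); i++) {
            char c = mention.charAt(i);
            if (Character.isWhitespace(c)) {
                atTokenStart = true;
                out.append(c);
            } else if (atTokenStart) {
                out.append(Character.toUpperCase(c));
                atTokenStart = false;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this helper, `leipzig` and `germany` from the second input would be sent as `Leipzig` and `Germany`, which are the forms that resolved in the first example.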

localhost:8080 returns a Blank Page

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

>mvn tomcat:run

[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.aksw:AGDISTIS:war:0.4.0
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 12, column 12
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ line 41, column 12
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building AGDISTIS 0.4.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> tomcat-maven-plugin:1.1:run (default-cli) > compile @ AGDISTIS >>>
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ AGDISTIS ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 9 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ AGDISTIS ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] <<< tomcat-maven-plugin:1.1:run (default-cli) < compile @ AGDISTIS <<<
[INFO]
[INFO] --- tomcat-maven-plugin:1.1:run (default-cli) @ AGDISTIS ---
[INFO] Running war on http://localhost:8080/AGDISTIS
[INFO] Using existing Tomcat server configuration at C:\Users\zaki\Documents\GitHub\AGDISTIS\AGDISTIS\target\tomcat
Jun 25, 2015 3:00:41 PM org.apache.catalina.startup.Embedded start
INFO: Starting tomcat server
Jun 25, 2015 3:00:41 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/6.0.29
Jun 25, 2015 3:00:41 PM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-8080
Jun 25, 2015 3:00:41 PM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-8080

I'm not sure where the problem is. I would appreciate any help. Thank you.

Khiati Z. Abdel-ilah

AGDISTIS fails with own index

Hi everyone,

Right now I am trying to get AGDISTIS to work with an index other than DBpedia. For testing, I created a tiny custom index and ran AGDISTIS on it. However, it does not return a proper URI for the disambiguated entity; it just returns the entity's text/label instead.

My approach so far:

  • Create three files: labels_en.ttl, instance_types_en.ttl and en_surface_forms.tsv. I followed the DBpedia 2014 index example from the wiki. They look like this:

labels_en.ttl:
<http://www.technologyreview.com/s/602283> <http://www.w3.org/2000/01/rdf-schema#label> "QuantumComputer"@en .

instance_types_en.ttl:
<http://www.technologyreview.com/s/602283> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/InformationAppliance> .

en_surface_forms.tsv:
http://www.technologyreview.com/s/602283 Quantum Computer Computer

  • Run mvn exec:java -Dexec.mainClass="org.aksw.agdistis.util.TripleIndexCreator" to create the actual index.

  • Modify properties file:

      nodeType=http://www.technologyreview.com/s/
      edgeType=http://dbpedia.org/ontology/
      baseURI =http://www.technologyreview.com/s
      threshholdTrigram=0.5
    
  • Run AGDISTIS (mergeT branch) with the following code (it gets the entity labels and positions from the Stanford CoreNLP MentionsAnnotator):

      public void disambiguateEntities() throws InterruptedException, IOException {
          NEDAlgo_HITS agdistis = new NEDAlgo_HITS();
          Document agdistisDocument = new Document();
          ArrayList<NamedEntityInText> entityList = new ArrayList<NamedEntityInText>();
    
          for (final CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
              for (final CoreMap entityMention : sentence.get(CoreAnnotations.MentionsAnnotation.class)) {
                  entityList.add(
                      new NamedEntityInText(entityMention.get(CoreAnnotations.CharacterOffsetBeginAnnotation.class),
                              entityMention.get(CoreAnnotations.TextAnnotation.class).length(),
                              entityMention.get(CoreAnnotations.TextAnnotation.class)));
    
              }
          }
    
          NamedEntitiesInText namedEntitiesInText = new NamedEntitiesInText(entityList);
          DocumentText documentText = new DocumentText(text);
    
          agdistisDocument.addText(documentText);
          agdistisDocument.addNamedEntitiesInText(namedEntitiesInText);
    
          agdistis.run(agdistisDocument, null);
    
          NamedEntitiesInText namedEntities = agdistisDocument.getNamedEntitiesInText();
    
          for (NamedEntityInText namedEntity : namedEntities) {
              String disambiguatedURL = namedEntity.getNamedEntityUri();
              this.results.put(namedEntity.getStartPos(), disambiguatedURL);
          }
      }
    

Now instead of returning QuantumComputer -> http://www.technologyreview.com/s/602283 it returns QuantumComputer -> QuantumComputer.

Is this an issue with my custom index? When I use the standard 2016 DBpedia index, my implementation works.

I would be very happy if you could provide an explanation of how to use a custom index that is a little bit more detailed than in the GitHub wiki. :-)

Thank you in advance!

PS: Once the custom index works, how can I add new triples to the existing index? With the addDocumentToIndex method in TripleIndexCreator.java?
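
One thing that can be checked independently of AGDISTIS is whether the subject URIs in the index files actually fall under the configured nodeType. The sketch below assumes, without confirmation from the source, that nodeType is used as a URI prefix filter; the class and method names are hypothetical. It does not establish the cause of the behaviour described above and is only a sanity check on the configuration and the .ttl lines.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sanity check for a custom index configuration; not part of AGDISTIS. */
final class IndexConfigCheck {

    /** Parses properties text such as the nodeType/edgeType/baseURI block above. */
    static Properties parse(String propertiesText) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    /** Extracts the subject URI from one N-Triples line of the form "<s> <p> o ." */
    static String subjectOf(String ntriplesLine) {
        String line = ntriplesLine.trim();
        int close = line.indexOf('>');
        if (!line.startsWith("<") || close < 0) {
            throw new IllegalArgumentException("Subject is not a URI: " + ntriplesLine);
        }
        return line.substring(1, close);
    }

    /** True if the subject of the triple starts with the configured nodeType value. */
    static boolean subjectMatchesNodeType(Properties config, String ntriplesLine) {
        String nodeType = config.getProperty("nodeType", "").trim();
        return !nodeType.isEmpty() && subjectOf(ntriplesLine).startsWith(nodeType);
    }
}
```

Running `subjectMatchesNodeType` over each line of labels_en.ttl and instance_types_en.ttl shows quickly whether any subject falls outside the nodeType prefix. Comparing `config.getProperty("baseURI")` with `config.getProperty("nodeType")` also makes differences such as a missing trailing slash visible.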

Handle more than one URI

Make AGDISTIS able to disambiguate more than one node type and to traverse more than one edge type.
