ollie's Introduction

Ollie

Ollie is a program that automatically identifies and extracts binary relationships from English sentences. Ollie is designed for Web-scale information extraction, where target relations are not specified in advance.

Ollie is our second-generation information extraction system. Whereas ReVerb operates on flat sequences of tokens, Ollie works with a tree-like representation (a graph with only small cycles) based on Stanford's compression of the dependencies. This allows Ollie to capture expressions that ReVerb misses, such as long-range relations.

Ollie also captures context that modifies a binary relation. Presently Ollie handles attribution (He said/she believes) and enabling conditions (if X then).

Quick Start

Docker

You can now run Ollie with a single Docker command.

docker run -it schmmd/ollie:latest

To configure Ollie, you can drop into a bash shell with docker run -it schmmd/ollie:latest /bin/bash and run Ollie from the command line.

Local Machine

If you want to run Ollie on a small amount of text without modifying the source code, you can use an executable file that can be run from the command line. Please note that Ollie was built using Scala 2.9 and so it requires Java 7. Follow these steps to get started:

  1. Download the latest Ollie binary from http://knowitall.cs.washington.edu/ollie/ollie-app-latest.jar.

  2. Download the linear English MaltParser model (engmalt.linear-1.7.mco) from http://www.maltparser.org/mco/english_parser/engmalt.html and place it in the same directory as Ollie.

  3. Run java -Xmx512m -jar ollie-app-latest.jar yourfile.txt. The input file should contain one sentence per line unless --split is specified. Omit the input file for an interactive console.

Examples

Enabling Condition

An enabling condition is a condition that needs to be met for the extraction to be true. Certain words demark an enabling condition, such as "if" and "when". Ollie captures enabling conditions if they are present.

sentence: If I slept past noon, I'd be late for work.
extraction: (I; 'd be late for; work)[enabler=If I slept past noon]

Attribution

An attribution clause specifies an entity that asserted an extraction and a verb that specifies the expression. Ollie captures attributions if they are present.

sentence: Some people say Barack Obama was not born in the United States.
extraction: (Barack Obama; was not born in; the United States)[attrib=Some people say]

sentence: Early astronomers believe that the earth is the center of the universe.
extraction: (the earth; is the center of; the universe)[attrib=Early astronomers believe]
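
The printed extractions above follow a regular shape, so they can be picked apart with a small script. The following is a minimal sketch, not part of Ollie: the function name parse_extraction and the regular expression are assumptions based only on the examples shown here, and an argument that itself contains "; " would be split incorrectly.

```python
import re

# Matches lines such as:
#   (arg1; relation; arg2)[attrib=Some people say]
# An "extraction:" prefix and the bracketed context are both optional.
EXTRACTION_RE = re.compile(
    r"^(?:extraction:\s*)?"
    r"\((?P<arg1>.*?); (?P<rel>.*?); (?P<arg2>.*?)\)"
    r"(?:\[(?P<kind>enabler|attrib)=(?P<context>.*)\])?$"
)


def parse_extraction(line):
    """Split one printed extraction into its parts, or return None."""
    match = EXTRACTION_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    print(parse_extraction(
        "(Barack Obama; was not born in; the United States)[attrib=Some people say]"
    ))
```

For extractions without context, the kind and context entries are None.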

Relational noun

Some relations are expressed without verbs. Ollie can capture these as well as verb-mediated relations.

sentence: Microsoft co-founder Bill Gates spoke at a conference on Monday.
extraction: (Bill Gates; be co-founder of; Microsoft)

N-ary extractions

Oftentimes, similar relations will specify different aspects of the same event. Since Ollie captures long-range relations, it can capture N-ary extractions by collapsing extractions where the relation phrase differs only by the preposition.

sentence: I learned that the 2012 Sasquatch music festival is scheduled for May 25th until May 28th.
extraction: (the 2012 Sasquatch music festival; is scheduled for; May 25th)
extraction: (the 2012 Sasquatch music festival; is scheduled until; May 28th)
nary: (the 2012 Sasquatch music festival; is scheduled; [for May 25th; until May 28th])
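
The collapsing step can be illustrated at the string level. This is only a sketch of the idea described above, not Ollie's implementation (which works on the dependency graph): the function name collapse_nary and the short preposition list are assumptions made for the example.

```python
from collections import defaultdict

# Small illustrative preposition list; an assumption, not Ollie's own list.
PREPOSITIONS = {"for", "until", "to", "in", "on", "at", "from", "by", "with"}


def collapse_nary(extractions):
    """Group (arg1, rel, arg2) triples whose relation phrases differ
    only in the trailing preposition."""
    groups = defaultdict(list)
    for arg1, rel, arg2 in extractions:
        words = rel.split()
        if len(words) > 1 and words[-1] in PREPOSITIONS:
            base, prep = " ".join(words[:-1]), words[-1]
            groups[(arg1, base)].append(f"{prep} {arg2}")
    # Only groups with two or more arguments form an n-ary extraction.
    return [
        (arg1, base, args)
        for (arg1, base), args in groups.items()
        if len(args) > 1
    ]


if __name__ == "__main__":
    festival = "the 2012 Sasquatch music festival"
    print(collapse_nary([
        (festival, "is scheduled for", "May 25th"),
        (festival, "is scheduled until", "May 28th"),
    ]))
```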

Building

Building Ollie from source requires Apache Maven (http://maven.apache.org). First, clone or download the Ollie source from GitHub. Run this command in the top-level source folder to download the required dependencies, compile, and create a single jar file.

mvn clean package

The compiled class files will be put in the base directory. The single executable jar file will be written to ollie-app-VERSION.jar where VERSION is the version number.

Command Line Interface

Once you have built Ollie, you can run it from the command line.

java -Xmx512m -jar ollie-app-VERSION.jar yourfile.txt

Omit the input file for an interactive console.

Ollie takes sentences, one per line, as input, or splits text into sentences if --split is specified. Run Ollie with --usage to see full usage.

The Ollie command line tool has a few output formats. The output format is specified with --output-format followed by one of the following formats:

  1. The interactive format is meant to be easily human readable.
  2. The tabbed format is meant to be easily parsable. A header is output as the first row to label the columns.
  3. The tabbedsingle format is similar to tabbed, but the extraction is output as (arg1; relation; arg2) in a single column.
  4. The serialized format is meant to be fully deserialized into an OllieExtractionInstance class.
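
Because the tabbed format starts with a header row, it can be read with a generic tab-separated reader. The sketch below assumes a column named "confidence"; that name and the sample rows are hypothetical, so check the header Ollie actually prints before relying on it.

```python
import csv
import io


def read_tabbed(stream, min_confidence=0.0, confidence_column="confidence"):
    """Yield rows of tab-separated output as dicts keyed by the header row,
    keeping only rows at or above min_confidence."""
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if float(row[confidence_column]) >= min_confidence:
            yield row


if __name__ == "__main__":
    # Hypothetical sample; real column names come from Ollie's own header row.
    sample = io.StringIO(
        "confidence\targ1\trel\targ2\n"
        "0.91\tJohn Doe\twas in\tWashington\n"
        "0.62\tJohn Doe\twas on\tThursday\n"
    )
    for row in read_tabbed(sample, min_confidence=0.8):
        print(row["arg1"], "|", row["rel"], "|", row["arg2"])
```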

Graphical Interface

Ollie works on top of a subcomponent called OpenParse. The distinction is largely technical; OpenParse does not handle attribution or enabling conditions and uses a coarser confidence metric. You can use a GUI application to visualize the OpenParse extractions in a parse tree. To use it, you will need to have graphviz installed. You can run the GUI with:

java -Xms512M -Xmx1g -cp ollie-app-VERSION.jar edu.knowitall.openparse.OpenParseGui

By default, this application will look for graphviz's dot program at /usr/bin/dot. You can specify a location with the --graphviz parameter.

You can try out your own models with Options->Load Model.... To see an example model, look at openparse.model in src/main/resources. Your model may have one or more patterns in it. If you want to see pattern matches (without node expansion) instead of triple extractions, you can choose to show the raw match with Options->Raw Matches. This will allow you to use patterns that do not capture an arg1, rel, and arg2.

Parsers

Ollie is packaged to use Malt Parser, one of the fastest dependency parsers available. You will need the model file (engmalt.linear-1.7.mco) in the directory the application is run from or you will need to specify its location with the --malt-model parameter. Malt Parser models are available online.

http://www.maltparser.org/mco/english_parser/engmalt.html

Ollie works with any other parser in the nlptools project. For example, it is easy to swap out Malt for Stanford's parser. Stanford's parser is not a part of the Ollie distribution by default because of licensing conflicts, but the Stanford parser was used as the execution parser for the results in the paper. Malt Parser was used to bootstrap the patterns. We are interested in Clear parser as an alternative, but it's not a trivial change because Clear uses a slightly different dependency representation.

Using Eclipse

To modify the Ollie source code in Eclipse, use the M2Eclipse plugin along with ScalaIDE. You can then import the project using the following.

File > Import > Existing Maven Projects

Including Ollie as a Dependency

Add the following as a Maven dependency.

<dependency>
  <groupId>edu.washington.cs.knowitall.ollie</groupId>
  <artifactId>ollie-core_2.9.2</artifactId>
  <version>[1.0.0, )</version>
</dependency>

The best way to find the latest version is to browse Maven Central.

ollie-core does not include a way to parse sentences. You will need to use a parser supplied by the nlptools project. The source for ollie-app is an excellent example of a project using ollie-core as a dependency. ollie-app supplies a parser from nlptools.

There is an example project that uses Ollie in the example folder of the source distribution.

Training the Confidence Function

While Ollie comes with a trained confidence function, it is possible to retrain the confidence function. First, you need to run Ollie over a set of sentences and store the output in the serialized format.

echo "Michael rolled down the hill." | java -jar ollie-app-1.0.0-SNAPSHOT.jar --serialized --output toannotate.tsv

Next you need to annotate the extractions. Modify the output file and change the first column to a binary annotation--1 for correct and 0 for wrong. Your final file will look similar to ollie/data/training.tsv. Now run the logistic regression trainer.

java -cp ollie-app-1.0.0-SNAPSHOT.jar edu.washington.cs.knowitall.ollie.confidence.train.TrainOllieConfidence toannotate.tsv
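
Before running the trainer, it can help to confirm that every row of the annotated file really starts with 0 or 1. This is a small sketch that assumes the file is tab-separated, as the .tsv extension suggests; the function name check_annotations is made up for the example, and a header row, if present, would be reported as a bad row.

```python
import sys


def check_annotations(lines):
    """Return (line_number, first_column) for every non-empty row
    whose first column is not the annotation 0 or 1."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        label = line.split("\t", 1)[0]
        if label not in ("0", "1"):
            bad.append((number, label))
    return bad


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        for number, label in check_annotations(handle):
            print(f"line {number}: unexpected annotation {label!r}")
```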

Concurrency

When operating at web scale, parallelism is essential. While the base Ollie extractor is immutable and thread safe, the parser may not be thread safe. I do not know whether Malt parser is thread safe.

FAQ

  1. How fast is Ollie?

    You should really benchmark Ollie yourself, but on my computer (a new computer in 2011), Ollie processed 5000 high-quality web sentences in 56 seconds, or 89 sentences per second, in a single thread. Ollie is easily parallelizable and the Ollie extractor itself is threadsafe (see Concurrency section).

Contact

To contact the UW about Ollie, email [email protected].

Citing Ollie

If you use Ollie in your academic work, please cite Ollie with the following BibTeX citation:

@inproceedings{ollie-emnlp12,
  author = {Mausam and Michael Schmitz and Robert Bart and Stephen Soderland and Oren Etzioni},
  title = {Open Language Learning for Information Extraction},
  booktitle = {Proceedings of Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CONLL)},
  year = {2012}
}

ollie's People

Contributors

mlotstein, schmmd

ollie's Issues

Ollie doesn't build correctly

I'm trying to use Ollie with Java. When I run the command mvn clean package, an error occurs:

[INFO] Scanning for projects...
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for edu.washington.cs.knowitall.ollie:ollie-core_2.9.2:jar:1.0.4-SNAPSHOT
[WARNING] 'parent.relativePath' points at edu.washington.cs.knowitall.ollie:ollie instead of edu.washington.cs.knowitall:knowitall-oss, please verify your project structure @ line 4, column 11
[WARNING] 
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING] 
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] 
[INFO] ollie-core
[INFO] ollie-app
[INFO] ollie
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building ollie-core 1.0.4-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ ollie-core_2.9.2 ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.0:enforce (enforce-maven) @ ollie-core_2.9.2 ---
[INFO] 
[INFO] --- maven-resources-plugin:2.3:resources (default-resources) @ ollie-core_2.9.2 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 7 resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ ollie-core_2.9.2 ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- scala-maven-plugin:3.1.1:compile (default) @ ollie-core_2.9.2 ---
[INFO] /home/herlonaguiar/UFGD/LP1/ollie/core/src/main/scala:-1: info: compiling
[INFO] Compiling 36 source files to /home/herlonaguiar/UFGD/LP1/ollie/core/target/classes at 1429067933578
[INFO] prepare-compile in 0 s
[INFO] compile in 33 s
[INFO] 
[INFO] --- maven-resources-plugin:2.3:testResources (default-testResources) @ ollie-core_2.9.2 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] 
[INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ ollie-core_2.9.2 ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- scala-maven-plugin:3.1.1:testCompile (default) @ ollie-core_2.9.2 ---
[INFO] /home/herlonaguiar/UFGD/LP1/ollie/core/src/test/scala:-1: info: compiling
[INFO] Compiling 8 source files to /home/herlonaguiar/UFGD/LP1/ollie/core/target/test-classes at 1429067967006
[INFO] prepare-compile in 0 s
[INFO] compile in 18 s
[INFO] 
[INFO] --- maven-surefire-plugin:2.10:test (default-test) @ ollie-core_2.9.2 ---
[INFO] Surefire report directory: /home/herlonaguiar/UFGD/LP1/ollie/core/target/surefire-reports

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running edu.knowitall.openparse.OpenParseSpecTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.179 sec
Running edu.knowitall.openparse.PatternExtractorSpecTest
Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.379 sec
Running edu.knowitall.openparse.BuildPatternsSpecTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.234 sec
Running edu.knowitall.openparse.ExtractorPatternSpecTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.042 sec
Running edu.knowitall.openparse.OllieSpecTest
Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.261 sec <<< FAILURE!
Running edu.knowitall.ollie.DependencyGraphExtrasTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.053 sec
Running edu.knowitall.common.enrich.TraversableSpecTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.053 sec

Results :

Tests in error: 
  Ollie finds an example extraction(edu.knowitall.openparse.OllieSpecTest): Could not tab deserialize: (._._5_37); nsubj(finds_VBZ_1_10, OpenParse_NNP_0_0); dobj(finds_VBZ_1_10, extraction_NN_4_27); det(extraction_NN_4_27, an_DT_2_16); nn(extraction_NN_4_27, example_NN_3_19)     Template        {rel}   {arg1} <nsubj< {rel:postag=VBZ} >dobj> {arg2}   0.1443  OpenParse ;;; OpenParse_NNP_0_0 finds ;;; finds_VBZ_1_10        an example extraction ;;; an_DT_2_16; example_NN_3_19; extraction_NN_4_27       0,14430 None    None

Tests run: 41, Failures: 0, Errors: 1, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] ollie-core ........................................ FAILURE [1:00.375s]
[INFO] ollie-app ......................................... SKIPPED
[INFO] ollie ............................................. SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:00.594s
[INFO] Finished at: Tue Apr 14 23:19:51 AMT 2015
[INFO] Final Memory: 15M/163M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.10:test (default-test) on project ollie-core_2.9.2: There are test failures.
[ERROR] 
[ERROR] Please refer to /home/herlonaguiar/UFGD/LP1/ollie/core/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
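
One detail in the failing test output: the serialized line carries the confidence as 0.1443 in one column and as 0,14430 in another. This suggests, as an assumption that is not confirmed anywhere in this thread, that a number was formatted under a locale that uses a comma as the decimal separator and then could not be read back. The sketch below only looks for such comma-decimal tokens in a line of output; the function name is made up for the example.

```python
import re

# A run of digits, a comma, and more digits, not glued to other word
# characters, periods, or commas (so 1_10 and 0.1443 are ignored).
COMMA_DECIMAL = re.compile(r"(?<![\w.,])\d+,\d+(?![\w.,])")


def comma_decimals(text):
    """Return tokens that look like numbers written with a decimal comma."""
    return COMMA_DECIMAL.findall(text)


if __name__ == "__main__":
    line = "{rel}   0.1443  OpenParse ;;; finds_VBZ_1_10    0,14430 None    None"
    print(comma_decimals(line))
```

If that is the cause, running the build under a locale that uses a period as the decimal separator may be worth trying.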

Relation not expanded over "acomp"

Consider the sentence "She looks beautiful on Thursday." In this case, "beautiful" is connected to "looks" via an acomp edge. We presently don't get (She, looks, beautiful) as a simple extraction, nor (She, looks beautiful on, Thursday). It probably makes sense to include acomps with the relation. The acomp adjective can be further modified, e.g. "very beautiful".

How to make Ollie or OpenIE work with medium size of data in Java project

Hi everyone,

I am trying to use either Ollie or OpenIE to extract knowledge from around 100 - 1,000 ocean science web pages. I have imported OpenIE (4.2) into my Java project through Maven. It works well with several sentences, but once it gets to more than 2 pages, I start to see an "out of memory" error. I have set the heap size to -Xmx2700m.

Is there any way to make it work without modifying the source code (I don't know Scala)?

BTW, I also tried Stanford OpenIE. Although it hasn't given me any errors so far, the triples it extracted were pretty messy.

Thanks,
Cody

No rel-rel extractions

Ollie does not presently handle relations that have multiple verbs. For example:

(VBD) I tried to tie my shoes.
(VB)  I will try to tie my shoes.
(VBG) I was trying to tie my shoes.
(VBP) They try to tie their shoes.
(VBZ) He tries to tie his shoes.
(VBN) I had tried to tie my shoes.

The patterns that apply look like:

{rel} {prep}    {arg1} <nsubj< {rel1:postag=VB} >xcomp> {rel2:postag=VB} >{prep:regex=prep_(.*)}> {arg2}  
{rel}   {arg1} <nsubj< {rel1:postag=VB} >xcomp> {rel2:postag=VB} >dobj> {arg2}

Where the rel1 postag can be any of {VBZ, VBD, VBG, VBN, VBP, VB}.

Ollie sometimes inserts "be" over conjunction edge

Hi

Having grabbed the latest Ollie JAR, which I believe is v1.0.2, I’ve noticed something odd after running a few sentences through it, and wondered if you could take a look and confirm whether it’s a bug. I had used an older version (1.0.0) before, so I ran this sentence through that version again to check, and I don’t see the same oddity there.

Here is a simple form of the sentence structure that demonstrates the problem:
The dog has died due to shock and is being buried at home tomorrow.

With Ollie 1.0.0 we get:
0.793: (The dog; has died due to; shock)
0.698: (The dog; is being buried at; home)

But with 1.0.2 we get:
0.673: (The dog; has died is due to; shock)
0.554: (The dog; has is being buried at; home)

Other than the scores changing, it’s simply the inclusion of the multiple VBZ nodes across the two relations that doesn’t look right. Could you confirm this is unexpected behavior? I couldn’t think of a reason why you might want this to be the case. Sorry if this has already been fixed; I haven’t had time to grab the very latest source and build it to test against.

Thanks

Tony

Antony Scerri
Principal Technology Researcher, Elsevier Labs

MaltParser croaks on unicode characters

I need to replace Unicode characters with their ASCII equivalents. I remember doing this before and referring to this Stack Overflow post, but I'm not sure where I did this. As an example, the first apostrophe messes up MaltParser's interpretation of the sentence.

President Bashar al-Assad’s forces have resorted to firing ballistic missiles at rebel fighters inside Syria, Obama administration officials said Wednesday, escalating a nearly two-year-old civil war as the government struggles to slow the momentum of a gaining insurgency.
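
As a preprocessing step outside Ollie, the replacement could look like the following sketch. The punctuation table and the function name to_ascii are choices made for this example, not something taken from the Ollie source; characters with no ASCII counterpart are simply dropped.

```python
import unicodedata

# Punctuation that Unicode normalization does not map to ASCII by itself.
PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",  # single curly quotes
    "\u201c": '"', "\u201d": '"',  # double curly quotes
    "\u2013": "-", "\u2014": "-",  # en and em dashes
}


def to_ascii(text):
    """Replace common Unicode punctuation, strip accents, and drop
    whatever still has no ASCII equivalent."""
    for source, target in PUNCTUATION.items():
        text = text.replace(source, target)
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_ascii("President Bashar al-Assad\u2019s forces"))
```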

Build fails with Java 1.8.0_92-b14, Mac Sierra

I am trying to build with Java 1.8.0_92-b14. However, it generates the following error:

[ERROR] error: error while loading CharSequence, class file '/Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken
[INFO] (bad constant pool tag 18 at byte 10)

No ccomp attachment with cop relation

Here is an example:

McConnell said he was willing to look at any plan by President Barack Obama to avoid the fiscal cliff and a Senate aide said congressional leaders could hold talks with the president on Friday.

0.75    congressional leaders   could hold  talks
0.57    he  was willing
0.72    the fiscal cliff and a Senate aide  said of congressional leaders

Switch to ClearParser

Switch to Clear parser for superior parses with equal scalability. Some patterns will need to be changed because Clear represents them differently (better). In particular, there are no cop edges.

What can I do to train a model?

Hello, I want to train a model, so I ran:
java -cp ollie-app-latest.jar edu.knowitall.openparse.BuildPatterns data/lemmagrep.txt data/patterns.txt
lemmagrep.txt was downloaded from http://knowitall.cs.washington.edu/ollie/data/lemmagrep.txt.bz2

but I get an error:

15:20:08.685 [main] INFO e.knowitall.openparse.BuildPatterns$ - chunk size: 100000
15:20:08.689 [main] INFO e.knowitall.openparse.BuildPatterns$ - pattern length: None
15:20:08.832 [main] ERROR e.knowitall.openparse.BuildPatterns$ - could not deserialize graph: 1
edu.knowitall.tool.parse.graph.DependencyGraph$SerializationException: Could not deserialize graph: 1
at edu.knowitall.tool.parse.graph.DependencyGraph$.deserialize(DependencyGraph.scala:606) ~[ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1$$anonfun$3$$anonfun$apply$mcV$sp$1.apply(BuildPatterns.scala:79) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1$$anonfun$3$$anonfun$apply$mcV$sp$1.apply(BuildPatterns.scala:73) [ollie-app-latest.jar:na]
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59) [ollie-app-latest.jar:na]
at scala.collection.immutable.List.foreach(List.scala:76) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1$$anonfun$3.apply$mcV$sp(BuildPatterns.scala:73) [ollie-app-latest.jar:na]
at edu.knowitall.common.Timing$.time(Timing.scala:59) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1.apply(BuildPatterns.scala:73) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1.apply(BuildPatterns.scala:68) [ollie-app-latest.jar:na]
at scala.collection.Iterator$class.foreach(Iterator.scala:772) [ollie-app-latest.jar:na]
at scala.collection.Iterator$GroupedIterator.foreach(Iterator.scala:907) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$.main(BuildPatterns.scala:68) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$.main(BuildPatterns.scala:50) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns.main(BuildPatterns.scala) [ollie-app-latest.jar:na]
Caused by: edu.knowitall.tool.parse.graph.Dependency$SerializationException: could not deserialize dependency: 1
at edu.knowitall.tool.parse.graph.Dependency$.deserialize(Dependency.scala:62) ~[ollie-app-latest.jar:na]
at edu.knowitall.tool.parse.graph.Dependencies$$anonfun$deserialize$1.apply(Dependency.scala:74) ~[ollie-app-latest.jar:na]
at edu.knowitall.tool.parse.graph.Dependencies$$anonfun$deserialize$1.apply(Dependency.scala:74) ~[ollie-app-latest.jar:na]
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233) ~[ollie-app-latest.jar:na]
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233) ~[ollie-app-latest.jar:na]
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34) ~[ollie-app-latest.jar:na]
at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:38) ~[ollie-app-latest.jar:na]
at scala.collection.TraversableLike$class.map(TraversableLike.scala:233) ~[ollie-app-latest.jar:na]
at scala.collection.mutable.ArrayOps.map(ArrayOps.scala:38) ~[ollie-app-latest.jar:na]
at edu.knowitall.tool.parse.graph.Dependencies$.deserialize(Dependency.scala:74) ~[ollie-app-latest.jar:na]
at edu.knowitall.tool.parse.graph.DependencyGraph$.rec$1(DependencyGraph.scala:596) ~[ollie-app-latest.jar:na]
at edu.knowitall.tool.parse.graph.DependencyGraph$.deserialize(DependencyGraph.scala:601) ~[ollie-app-latest.jar:na]
... 13 common frames omitted
Caused by: scala.MatchError: 1 (of class java.lang.String)
at edu.knowitall.tool.parse.graph.Dependency$.deserialize(Dependency.scala:55) ~[ollie-app-latest.jar:na]
... 24 common frames omitted
15:20:08.859 [main] ERROR e.knowitall.openparse.BuildPatterns$ - could not deserialize graph: 1
edu.knowitall.tool.parse.graph.DependencyGraph$SerializationException: Could not deserialize graph: 1
at edu.knowitall.tool.parse.graph.DependencyGraph$.deserialize(DependencyGraph.scala:606) ~[ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1$$anonfun$3$$anonfun$apply$mcV$sp$1.apply(BuildPatterns.scala:79) [ollie-app-latest.jar:na]
at edu.knowitall.openparse.BuildPatterns$$anonfun$main$1$$anonfun$3$$anonfun$apply$mcV$sp$1.apply(BuildPatterns.scala:73) [ollie-app-latest.jar:na]
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59) [ollie-app-latest.jar:na]
at scala.collection.immutable.List.foreach(List.scala:76) [ollie-app-latest.jar:na]
......

Could you tell me what I can do to fix it? thanks!

Auxiliaries don't distribute over conj_and

For the first sentence we get (you, like to eat, cherries), but for the second we get (you, like eat, cherries).

He says that you like to eat cherries and swim.
He says that you like to swim and eat cherries.

Java Version for Successful Build

Just a heads-up: as of March 2017, on Unix systems, the Maven build for the project seems to work only with JDK 1.7.
Also, even after a successful build, executing it exits with the error:
Loading ollie models... Exception in thread "main" java.lang.reflect.InaccessibleObjectException: Unable to make member of class sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream accessible: module java.base does not export sun.net.www.protocol.jar to unnamed module @533377b
at sun.reflect.Reflection.throwInaccessibleObjectException(java.base@9-internal/Reflection.java:420)
at java.lang.reflect.AccessibleObject.checkCanSetAccessible(java.base@9-internal/AccessibleObject.java:174)
at java.lang.reflect.Method.checkCanSetAccessible(java.base@9-internal/Method.java:189)
at java.lang.reflect.Method.setAccessible(java.base@9-internal/Method.java:183)
at scala.runtime.ScalaRunTime$.ensureAccessible(ScalaRunTime.scala:136)
at edu.knowitall.common.Resource$.reflMethod$Method1(Resource.scala:16)
at edu.knowitall.common.Resource$.using(Resource.scala:16)
at edu.knowitall.openparse.OpenParse$.fromModelUrl(OpenParse.scala:175)
at edu.knowitall.ollie.OllieCli$$anonfun$22.apply(OllieCli.scala:217)
at edu.knowitall.ollie.OllieCli$$anonfun$22.apply(OllieCli.scala:212)
at edu.knowitall.common.Timing$.time(Timing.scala:50)
at edu.knowitall.common.Timing$.timeThen(Timing.scala:72)
at edu.knowitall.ollie.OllieCli$.run(OllieCli.scala:219)
at edu.knowitall.ollie.OllieCli$.main(OllieCli.scala:187)
at edu.knowitall.ollie.OllieCli.main(OllieCli.scala)

My ollie can't work.

I have put the file "engmalt.linear-1.7.mco" in the correct position.

jar:file:/D:/JavaSpace/TweetsOIE/lib/ollie-app-latest.jar!/logback.xml
Initializing malt: -u file:/D:/JavaSpace/TweetsOIE/engmalt.linear-1.7.mco -m parse -cl off
Exception in thread "main"
There was an error configurating MaltParser.
This is most likely because the model file 'file:/D:/JavaSpace/TweetsOIE/engmalt.linear-1.7.mco' was not found.
Please download the MaltParser model file from http://www.maltparser.org.

org.maltparser.core.config.ConfigurationException: The entry 'engmalt.linear-1.7_singlemalt.info' in the mco url 'file:/D:/JavaSpace/TweetsOIE/engmalt.linear-1.7.mco' cannot be loaded.
at org.maltparser.core.config.ConfigurationDir.getInputStreamReaderFromConfigFileEntry(ConfigurationDir.java:391)
at org.maltparser.core.config.ConfigurationDir.initCreatedByMaltParserVersionFromInfoFile(ConfigurationDir.java:1031)
at org.maltparser.core.config.ConfigDirChartItem.initialize(ConfigDirChartItem.java:93)
at org.maltparser.core.flow.FlowChartInstance.initChartItem(FlowChartInstance.java:72)
at org.maltparser.core.flow.FlowChartInstance.<init>(FlowChartInstance.java:53)
at org.maltparser.core.flow.FlowChartManager.initialize(FlowChartManager.java:104)
at org.maltparser.Engine.initialize(Engine.java:45)
at org.maltparser.MaltParserService.initializeParserModel(MaltParserService.java:107)
at edu.knowitall.tool.parse.MaltParser.initializeMaltParserService(MaltParser.scala:55)
at edu.knowitall.tool.parse.MaltParser.<init>(MaltParser.scala:30)
at ollie.JavaOllieWrapper.<init>(JavaOllieWrapper.java:26)
at ollie.JavaOllieWrapper.main(JavaOllieWrapper.java:49)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:215)
at java.util.zip.ZipFile.<init>(ZipFile.java:145)
at java.util.jar.JarFile.<init>(JarFile.java:153)
at java.util.jar.JarFile.<init>(JarFile.java:90)
at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:94)
at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
at sun.net.www.protocol.jar.JarURLConnection.getJarFile(JarURLConnection.java:89)
at org.maltparser.core.config.ConfigurationDir.getInputStreamReaderFromConfigFileEntry(ConfigurationDir.java:378)
... 11 more
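
The trace ends in a ZipException raised while opening the .mco file as a jar, so one thing worth checking is whether the downloaded model is a readable zip archive at all (for example, whether the download was cut short or a web page was saved in its place). This is a diagnostic sketch only; the function name check_mco is made up, and the entry suffix is taken from the error message above.

```python
import sys
import zipfile


def check_mco(path, expected_entry_suffix="_singlemalt.info"):
    """Return a short diagnosis for a MaltParser .mco model file."""
    if not zipfile.is_zipfile(path):
        return "not a valid zip archive (possibly an incomplete or wrong download)"
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    if not any(name.endswith(expected_entry_suffix) for name in names):
        return "archive opens, but no *_singlemalt.info entry was found"
    return "ok"


if __name__ == "__main__" and len(sys.argv) > 1:
    print(check_mco(sys.argv[1]))
```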

Nonsensical relation phrases with some remote arg2s.

Ollie produces some incorrect extractions with remote arg2s.

This seems to happen mostly when the remote arg2 is a date or a temporal expression and the relation phrase is based on a "be" verb.

Examples: The second extractions are all incorrect.

John Doe was in Washington on Thursday .

ollie extractions

0.91    John Doe    was in  Washington
0.88    John Doe    was on  Thursday

John Doe was in Washington for his vacation .

ollie extractions
0.91 John Doe was in Washington
0.85 John Doe was for his vacation

John Doe is in Washington to file his candidacy.

0.87    John Doe    is in   Washington
0.62    John Doe    is to file  his candidacy

how to set expandExtraction to false and run Ollie from source?

I tried to set expandExtraction to false in openparse.configuration. However, when I try to run the example here using the following command, I get errors.

mvn clean compile exec:java "-Dexec.mainClass=ollie.Example"

I got a long error, but I am copying only the last part:

lue for parameter num: Numeric[Double]
[ERROR]             if (overlaps.iterator.map(_._2).sum > 0.75) {
[ERROR]                                             ^
[ERROR] 20 errors found
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] ollie-core 1.0.4-SNAPSHOT .......................... FAILURE [ 11.231 s]
[INFO] ollie-app 1.0.1-SNAPSHOT ........................... SKIPPED
[INFO] ollie 1.0.0-SNAPSHOT ............................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  11.417 s
[INFO] Finished at: 2018-12-18T17:09:11-08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.1.1:compile (default) on project ollie-core_2.9.2: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1(Exit value: 1) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

Any help regarding how I can modify and run Ollie on my dataset would be appreciated.

Conjunctions are always collapsed together

Among nonsmokers , birth control pills slightly raise a woman 's risk of abnormal blood clotting , high blood pressure , heart attack , and stroke .
(birth control pills; slightly raise a woman 's risk of abnormal blood clotting , high blood pressure , heart attack , and stroke among; nonsmokers)

The relation phrase in this case is jam-packed. It should probably be broken up into pieces, one per conjunct. I can't come up with an example where splitting would be invalid.

As a part of Jeff Bowden 's agreement with Florida State and Seminole Boosters , Inc. , he will receive an $107,500 annually , or $537,000 total , through August 2012 from the Booster club .
(he, will receive from, the Booster club)

After the release of her album and a guest-appearance with Gang Starr alongside Kurupt in 1998 , Rage left Death Row Records and the music industry generally to focus on acting , appearing in an episode of Kenan & Kel .
(Rage; left; Death Row Records and the music industry)

There are two direct objects (dobj edges) in these examples because of the conjunction. Each should be handled separately. Some sentences have two dobjs without a conjunction; in that case, I have found it best to exclude all dobjs.

This would redefine the expansion stage to expand a single match into multiple extractions.
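
The idea of expanding one match into several extractions can be sketched outside of Ollie as a post-processing step. This is a minimal illustration under stated assumptions: extractions are plain (arg1, rel, arg2) string triples, and conjuncts are split on commas and the words "and"/"or". The function names `split_conjoined_arg` and `expand` are hypothetical and are not Ollie APIs.

```python
import re

# Split on a comma (optionally followed by "and"/"or"), or on a bare "and"/"or".
_CONJ = re.compile(r"\s*,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def split_conjoined_arg(arg):
    """Split a conjoined argument string into its individual conjuncts."""
    return [part.strip() for part in _CONJ.split(arg) if part.strip()]


def expand(arg1, rel, arg2):
    """Expand one extraction into one extraction per conjunct of arg2."""
    return [(arg1, rel, part) for part in split_conjoined_arg(arg2)]


for triple in expand("Rage", "left", "Death Row Records and the music industry"):
    print(triple)
```

String-level splitting is fragile: it would break names such as "Seminole Boosters , Inc." and cannot distribute a shared head like "risk of" across conjuncts. A fix inside the expansion stage would more plausibly follow the conj edges of the dependency graph instead.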

Ollie croaks on null

If a token is the literal string "null", Ollie crashes. For example, the following input line produces the exception below.

Figure III-2-16. Ideal case of an unstable null point (Walton 1972) . . . . . . . . . . . . . . . . . . . . . . III-2-36
Exception in thread "main" org.maltparser.core.symbol.SymbolException: Symbol table error: empty string cannot be added to the symbol table
    at org.maltparser.core.symbol.trie.TrieSymbolTable.addSymbol(TrieSymbolTable.java:81)
    at org.maltparser.core.syntaxgraph.GraphElement.addLabel(GraphElement.java:31)
    at org.maltparser.core.syntaxgraph.SyntaxGraph.addLabel(SyntaxGraph.java:35)
    at org.maltparser.MaltParserService.parse(MaltParserService.java:149)
    at edu.knowitall.tool.parse.MaltParser.dependencies(MaltParser.scala:96)
    at edu.knowitall.tool.parse.MaltParser.dependencyGraph(MaltParser.scala:122)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12$$anonfun$apply$mcV$sp$1$$anonfun$apply$7$$anonfun$25.apply(OllieCli.scala:249)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12$$anonfun$apply$mcV$sp$1$$anonfun$apply$7$$anonfun$25.apply(OllieCli.scala:249)
    at scala.Option.map(Option.scala:133)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12$$anonfun$apply$mcV$sp$1$$anonfun$apply$7.apply(OllieCli.scala:249)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12$$anonfun$apply$mcV$sp$1$$anonfun$apply$7.apply(OllieCli.scala:241)
    at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
    at scala.collection.immutable.List.foreach(List.scala:76)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12$$anonfun$apply$mcV$sp$1.apply(OllieCli.scala:241)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12$$anonfun$apply$mcV$sp$1.apply(OllieCli.scala:237)
    at scala.collection.Iterator$class.foreach(Iterator.scala:772)
    at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$12.apply$mcV$sp(OllieCli.scala:237)
    at edu.knowitall.common.Timing$.time(Timing.scala:59)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1.processSource$1(OllieCli.scala:226)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$apply$12.apply(OllieCli.scala:290)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1$$anonfun$apply$12.apply(OllieCli.scala:289)
    at edu.knowitall.common.Resource$.using(Resource.scala:14)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1.apply(OllieCli.scala:289)
    at edu.knowitall.ollie.OllieCli$$anonfun$run$1.apply(OllieCli.scala:216)
    at edu.knowitall.common.Resource$.using(Resource.scala:14)
    at edu.knowitall.ollie.OllieCli$.run(OllieCli.scala:216)
    at edu.knowitall.ollie.OllieCli$.main(OllieCli.scala:174)
    at edu.knowitall.ollie.OllieCli.main(OllieCli.scala)
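
Until the parser handles this case, one workaround is to screen the input file before passing it to Ollie. The sketch below is an assumption-laden illustration, not a fix in Ollie itself: it assumes one sentence per line, whitespace tokenization, and that only the token "null" (matched case-insensitively here, which is a guess) triggers the failure. The names `has_problem_token` and `partition_sentences` are hypothetical.

```python
def has_problem_token(line):
    """Return True if any whitespace-delimited token is the literal string "null"."""
    return any(token.lower() == "null" for token in line.split())


def partition_sentences(lines):
    """Separate sentences that are safe to parse from those to set aside."""
    safe, skipped = [], []
    for line in lines:
        sentence = line.strip()
        if not sentence:
            continue  # ignore blank lines
        (skipped if has_problem_token(sentence) else safe).append(sentence)
    return safe, skipped


lines = [
    "Ollie extracted the text .",
    "Figure III-2-16. Ideal case of an unstable null point (Walton 1972)",
]
safe, skipped = partition_sentences(lines)
print(len(safe), "safe;", len(skipped), "skipped")
```

Writing `safe` to a new file and running Ollie on that file avoids the crash, at the cost of losing extractions from the skipped sentences.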

Ollie does not properly conjugate relations

Ollie presently does not attempt to conjugate relation verbs. For example, you might get the extraction (monkeys, be fed by, Humans) from the sentence "Humans feed monkeys." Here "be" is not conjugated because it is inferred (or, more properly, comes from a template), and the capitalization is taken directly from the sentence.

Sometimes the base verb has the wrong conjugation. Once again, this often happens when transforming an active sentence into a passive one. For example, you might get (Monkeys, are intentionally targeting by, humans), where "targeting" should ideally be "targeted".

Add a missing "rel rel" pattern

In the following sentence we don't get the desired rel rel extraction from the first clause.

President Fred's forces have resorted to firing ballistic missiles at rebel fighters inside Syria, Obama administration officials said Wednesday, escalating a nearly two-year-old civil war as the government struggles to slow the momentum of a gaining insurgency.

(President Fred 's forces, have resorted to firing, ballistic missiles)

However, we do get a rel rel extraction when we change the second relation verb to a VB.

President Fred's forces have resorted to fire ballistic missiles at rebel fighters inside Syria, Obama administration officials said Wednesday, escalating a nearly two-year-old civil war as the government struggles to slow the momentum of a gaining insurgency.

(President Fred 's forces, have resorted to fire, ballistic missiles)

I need to add the missing pattern.

Ollie doesn't extract passive conversions

Previously we would have two extractions for the following sentence.

Ollie extracted the text.
(Ollie, extracted, the text)
(The text, was extracted from, Ollie)

These may have disappeared when we removed reflections of patterns. We removed these so we don't infer general patterns through reflexive seed relations (married) that switch the arg1 and the arg2.

Ollie mishandles parentheses

Society is extremely concerned with innovating new and improved products and novel elements of culture; these creations provide the short-term satisfaction many people (especially Americans) adore.

Americans ) adore many people
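
One possible workaround, independent of Ollie, is to remove parenthetical text from each sentence before extraction. The sketch below assumes round brackets only and discards their contents entirely, so any relation expressed inside the parentheses is lost. The function name `strip_parentheticals` is hypothetical.

```python
import re

# Matches an innermost parenthetical together with any whitespace before it.
_PAREN = re.compile(r"\s*\([^()]*\)")


def strip_parentheticals(sentence):
    """Remove parenthesized spans, repeating so nested parentheses are handled."""
    previous = None
    while previous != sentence:
        previous = sentence
        sentence = _PAREN.sub("", sentence)
    # Collapse any doubled spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", sentence).strip()


print(strip_parentheticals(
    "these creations provide the short-term satisfaction "
    "many people (especially Americans) adore."
))
```

Unbalanced parentheses are left untouched by this sketch, so a stray ")" can still reach the parser.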

"be to" relation phrases

Sometimes the relation phrase has "be to" instead of "to be" when a template is used.

“Drop Trio was asked by SugarHill Recording Studios to record a song by Destiny 's Child.” – (a song, be to record by, Destiny’s Child);

“Ahmed Rushdi arrived in Pakistan to pursue a career as an actor.” – (a career, be to pursue as, an actor)

Is there any code for Open Pattern Learning?

I want to do some research on Ollie.

I read the paper and I am interested in the open pattern learning part.

I would like to know how patterns and templates are derived from a large number of paths; there are a lot of details involved.

I reviewed the code but could not find this part, since you already provide a model that contains the templates and patterns. Do you have the code, and could you share it?

Thanks a lot!
