evmeval's People

Contributors

hunterhector, junaraki

evmeval's Issues

demo_score.sh and scorer_test.py are outdated

Hi there!

demo_score.sh is outdated. Running it produces the following output:

Evaluating system A, should be a perfect system

[INFO] 2017-09-06 15:55:35,544 : Evaluation output will be saved at data/scoring_demo/score_prefect.scores
[ERROR] 2017-09-06 15:55:35,544 : Cannot find gold standard file at data/scoring_demo/gold.tbf
Stored score report at data/scoring_demo/score_prefect.tmp

After making the following changes to the script:

scorer=./scorer_v1.8.py

echo "Evaluating system A, should be a perfect system\n"
$scorer -g data/scoring_demo/token_based/gold.tbf -s data/scoring_demo/token_based/sample_system_A.tbf -d data/scoring_demo/A_out.tmp -t data/scoring_demo/token_based/tkn/ --eval_mode token -o data/scoring_demo/score_prefect.scores
echo "Stored score report at data/scoring_demo/score_prefect.tmp\n"

The first example works:

Evaluating system A, should be a perfect system

[INFO] 2017-09-06 15:59:19,851 : Evaluation output will be saved at data/scoring_demo/score_prefect.scores
[INFO] 2017-09-06 15:59:19,852 : Coreference mentions need to match mention_type before consideration
[INFO] 2017-09-06 15:59:19,852 : Evaluating Document example1
[INFO] 2017-09-06 15:59:19,852 : Evaluating Document example2
[INFO] 2017-09-06 15:59:19,857 : Evaluation Done.
Stored score report at data/scoring_demo/score_prefect.tmp

scorer_test.py is also outdated:

$ ./scorer_test.py
Please see test_tmp/test.log for test logs.
Running mention tests
Running format error test cases.
Traceback (most recent call last):
  File "scorer_v1.8.py", line 1145, in <module>
    main()
  File "scorer_v1.8.py", line 266, in main
    diff_out):
  File "scorer_v1.8.py", line 974, in evaluate
    parse_result = parse_line(sl, invisible_ids)
  File "scorer_v1.8.py", line 688, in parse_line
    spans = parse_characters(fields[3])
  File "scorer_v1.8.py", line 646, in parse_characters
    span = list(map(int, span_strs.split(Config.span_joiner)))
ValueError: invalid literal for int() with base 10: 't3'

Best,
Peter
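
The traceback is consistent with scorer_test.py feeding token-based test data to the scorer's character-mode parser: character spans must be integer pairs, while token ids carry a "t" prefix. A minimal sketch of the failing parse (hypothetical names, not the scorer's exact code):

```python
def parse_character_span(span_str, joiner=","):
    # Character-mode spans must be integers like "12,20";
    # a token id like "t3" makes int() raise ValueError,
    # exactly as in the traceback above.
    return [int(s) for s in span_str.split(joiner)]

parse_character_span("12,20")   # fine in character mode
# parse_character_span("t3")    # ValueError: invalid literal for int()
```

So the test data (or the mode flag passed by scorer_test.py) needs to be updated to match the current scorer version.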

Incomplete/empty token list generated by TokenFileMaker

The token file maker reads the gold standard file to make sure the alignment is correct.

On the following line (case 2), when evmBegin is smaller than the token's begin offset, it simply moves on:

https://github.com/hunterhector/EvmEval/blob/master/token-file-maker/src/main/java/evmeval/TokenFileMaker.java#L88

However, when that event is the last event mention, all tokens after the mention are ignored and hence never printed to the token list.

You can reproduce the error by running the following command.

java -jar bin/token-file-maker-1.0.3-jar-with-dependencies.jar -a data/test_cases/conll_tests/ann -e txt -t data/test_cases/conll_tests/txt -o debug

Then, in debug, you can find the token list, which contains only one sentence, while the original document contains two sentences.
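
The failure mode can be sketched in a few lines of Python (the real implementation is Java; names here are illustrative): if the loop only emits tokens while advancing past event mentions, everything after the last mention is silently dropped unless it is flushed at the end.

```python
def make_token_list(tokens, event_begins):
    # tokens: list of (begin_offset, text); event_begins: mention begin offsets.
    out = []
    ti = 0
    for begin in sorted(event_begins):
        # Emit tokens up to and including the one covering this mention.
        while ti < len(tokens) and tokens[ti][0] <= begin:
            out.append(tokens[ti])
            ti += 1
    # Without this final flush, every token after the last mention is lost.
    out.extend(tokens[ti:])
    return out
```

With a mention only in the first sentence, the flush is what keeps the second sentence's tokens in the output.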

Make token converter file extensions specifiable through command line.

The current ldc-xml-to-token converter assumes fixed file extensions:

text file (with tags): *.mpdf.xml
annotation file: *.rich_ere.xml

Because LDC's annotation naming convention changes frequently :( , this easily stops working, and there is no relevant error message. If one runs our code, the conversion appears "successful" yet the final tbf is empty. This is annoying and hard to track down.

I am proposing the following enhancement:

  1. Pass the extensions as command-line arguments.
  2. Report the number of files converted (and probably give an informative warning when zero files are converted).

What do you think?
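
Point 2 could look roughly like this (a Python sketch; the converter itself is Java and the function name is hypothetical):

```python
import glob
import logging
import os

def find_annotation_files(input_dir, ann_ext="rich_ere.xml"):
    # ann_ext would come from a command-line argument instead of
    # being hard-coded into the converter.
    files = glob.glob(os.path.join(input_dir, "*." + ann_ext))
    if not files:
        logging.warning(
            "0 files matched '*.%s' in %s; the resulting tbf will be "
            "empty. Did the LDC naming convention change?",
            ann_ext, input_dir)
    return files
```

A loud zero-files warning alone would already turn the silent "successful but empty tbf" case into an actionable message.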

Scorer doesn't work and validation fails

Hi, sorry to bother again.

I tried to use the scorer to evaluate the generated "gold.tbf" against itself, following "demo_scorer.sh"; this should be a perfect match.
However, the program gets stuck after printing the information of the first doc on the terminal.

Then I tried "validator.py" to examine "gold.tbf", and it gives the following error (in part):

[ERROR] 2015-08-11 10:48:43,257 : Cannot find token file for doc [3f71fead3fa119ccdcdf01769ffee5b1] at [/Users/luo123n/Projects/event_detection/data/conversion/tkn/3f71fead3fa119ccdcdf01769ffee5b1.txt.tab], did you use correct file paths?

It seems "validator.py" expects the token file to end with ".txt.tab", while the tokenizer produces ".tab".

I'm not sure whether this is what causes the scorer to malfunction.
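
Until the extensions are reconciled, one workaround (a sketch; it assumes the validator's expected suffix really is ".txt.tab") is to rename the tokenizer output:

```python
import os

def normalize_token_extensions(tkn_dir):
    # Rename "<docid>.tab" to "<docid>.txt.tab" so the validator
    # and scorer can locate the token files.
    for name in os.listdir(tkn_dir):
        if name.endswith(".tab") and not name.endswith(".txt.tab"):
            src = os.path.join(tkn_dir, name)
            dst = os.path.join(tkn_dir, name[:-len(".tab")] + ".txt.tab")
            os.rename(src, dst)
```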

Converter doesn't work for LDC2015E73

Hi

I'm a participant of TAC 2015 Event Nugget Detection task.
I think the training data for this year is LDC2015E73 (TAC KBP 2015 Event Nugget Training Data Annotation), right?

However, the converter doesn't work on this data set (I tried 'demo_conversion.sh').
The xml file names look like: 1b386c986f9d06fd0a0dda70c3b8ade9.event_nuggets.xml
There is no mpdf file and no rich_ere file.
It doesn't get better when I rename the file above to: 1b386c986f9d06fd0a0dda70c3b8ade9.rich_ere.xml

Am I using the wrong data?
If not, can you please tell me how I can evaluate the performance of my system on this dataset?

Thanks a lot.

MUC Scorer inconsistency

In the case where there is no gold coreference and no system coreference, the metrics in the CoNLL scorer behave differently: CEAF, B-Cubed and BLANC give a score of 100%, while MUC gives a score of 0. Such cases might be rare. Consider filing an issue with the CoNLL reference implementation.
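
The asymmetry follows from MUC being link-based: with no links on either side, its precision and recall are 0/0, which the reference implementation reports as 0. A simplified link-level sketch (not the full partition-based MUC definition) illustrates the degenerate case:

```python
def link_f1(gold_links, sys_links):
    # Degenerate case: no gold links and no system links gives 0/0,
    # conventionally reported as 0, whereas B-Cubed/CEAF/BLANC score
    # the all-singleton case as perfect (100%).
    gold, sys = set(gold_links), set(sys_links)
    recall = len(gold & sys) / len(gold) if gold else 0.0
    precision = len(gold & sys) / len(sys) if sys else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```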

Subevent Evaluation

Hello,
Thanks for providing this scorer. I am just wondering how to evaluate subevents without the After relation.
Thanks

Additional space for "tab" files.

In the "tab" token file, there are additional spaces between the fields, as on the following line:

1 my 66 68

Each field is separated by a tab and a space. Luckily this should be easy to work around in later pipelines, but we'd better fix it.
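
If fixing the generator takes a while, downstream consumers can normalize the separator themselves (a sketch; it assumes fields contain no internal whitespace):

```python
def normalize_tab_line(line):
    # Collapse the "tab plus space" separator back to a single tab.
    return "\t".join(line.split())
```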

brat2tbf

I am getting the following error; kindly let me know how to run this code:

Asad:EvmEval-master asad$ ./util/brat2tbf.py ajax.ann -t util/ajax.txt [-o OUT]
usage: brat2tbf.py [-h] (-d DIR | -f FILE) [-t TOKEN_PATH] [-o OUT] [-oe EXT]
[-i EID] [-w] [-of OFFSET_FIELD]
[-te TOKEN_TABLE_EXTENSION] [-ae ANNOTATION_EXTENSION] [-b]
brat2tbf.py: error: one of the arguments -d/--dir -f/--file is required
Asad:EvmEval-master asad$ ./util/brat2tbf.py ajax.ann -t util [-o OUT]
usage: brat2tbf.py [-h] (-d DIR | -f FILE) [-t TOKEN_PATH] [-o OUT] [-oe EXT]
[-i EID] [-w] [-of OFFSET_FIELD]
[-te TOKEN_TABLE_EXTENSION] [-ae ANNOTATION_EXTENSION] [-b]
brat2tbf.py: error: one of the arguments -d/--dir -f/--file is required

Hanging when evaluating many events

Hello, I am trying to use @after evaluation with many events (~400 in a document). I understand it computes some sort of transitive closure, but it took a very long time (it did not even finish). Do you have any suggestions for scaling it?

My project requires evaluating a gold document with ~400 events (on @after) against the system output. All in just one document.
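
If the bottleneck is the closure computation itself, a memoized depth-first traversal of the @after graph handles a few hundred events comfortably (a sketch; it assumes the relation forms a DAG, i.e. contains no cycles):

```python
def transitive_closure(n, edges):
    # n events labeled 0..n-1; edges are (earlier, later) "after" pairs.
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
    reach = [None] * n  # reach[u] = set of events reachable from u

    def dfs(u):
        if reach[u] is None:
            r = set()
            for v in adj[u]:
                r.add(v)
                r |= dfs(v)  # memoized: each node expanded once
            reach[u] = r
        return reach[u]

    for u in range(n):
        dfs(u)
    return reach
```

Each node is expanded exactly once, so this is far cheaper than repeatedly re-deriving implied pairs; whether it can be dropped into the scorer depends on how the scorer represents the relation internally.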

rich_ere to tbf conversion

Hi Hector,

I have a question. I see that there are two ways of converting rich ERE xmls to tbf:

  • In one step, via bin/rich_ere_to_tbf_converter.jar
  • Via brat format: first bin/rich_ere_to_brat_converter.jar, then util/brat2tbf.py

Assuming that I use character mode, am I expected to always get the same output from both conversion methods?

Thanks and best,
Peter
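
Since mention ids may differ between the two routes even when the mentions agree, a plain diff of the two tbf files can mislead. One way to compare them (a sketch; it assumes tab-separated mention lines with span, text and type in fixed columns, which may not match the exact tbf layout):

```python
def tbf_mention_keys(path):
    # Collect (doc_id, span, text, type) tuples, ignoring the run id
    # (column 0) and mention id (column 2), which are route-specific.
    keys = set()
    doc = None
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith("#BeginOfDocument"):
                doc = line.split()[-1]
            elif line and not line.startswith(("#", "@")):
                fields = line.split("\t")
                keys.add((doc, fields[3], fields[4], fields[5]))
    return keys
```

Comparing tbf_mention_keys() of the two outputs would show whether the one-step and two-step routes produce the same mentions up to id renaming.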

Rich ERE to Brat converter offsets problem

I try to convert the LDC2016E71_TAC_KBP_2016_Eval_Core_Set_Rich_ERE_Annotation ERE data to brat.
I use:
java -jar bin/rich_ere_to_brat_converter.jar -t "$ldc_text_dir" -te "xml" -a "$ldc_nugget_dir" -ae "rich_ere.xml" -o "$brat_output_dir"

It generates warnings for all files:
[WARN] Invalid offset 1961, 1969 of event mention [ police ] for string [displace] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 1988, 1996 of event mention [to displ] for string [showered] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2163, 2169 of event mention [ 1 a.m] for string [firing] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2252, 2260 of event mention [owds of ] for string [gathered] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2252, 2260 of event mention [owds of ] for string [gathered] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2388, 2395 of event mention [ple wer] for string [clashes] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2397, 2409 of event mention [injured in t] for string [according to] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2557, 2564 of event mention [elled g] for string [killing] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2582, 2590 of event mention [a bus, k] for string [injuring] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2582, 2590 of event mention [a bus, k] for string [injuring] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2943, 2951 of event mention [et with ] for string [reported] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 2612, 2620 of event mention [d injuri] for string [reported] at doc [NYT_ENG_20130716.0108]
[WARN] Invalid offset 3100, 3103 of event mention [con] for string [aid] at doc [NYT_ENG_20130716.0108]

It seems to be an offset validation problem: the text at an offset shifted by -30 is compared against the actual annotation. In the actual ERE files everything seems okay.

Since these are just warnings, I would expect the converter to still output something, but it generates empty output.
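
A constant shift like the apparent -30 here usually means the offsets were computed against a differently preprocessed version of the text (for example, with or without XML headers). A small diagnostic (a hypothetical helper, not part of the converter) can confirm the shift:

```python
def find_offset_shift(text, begin, annotated_str, window=100):
    # Return the shift d such that text[begin+d : begin+d+len] equals
    # the annotated string, or None if it occurs nowhere nearby.
    width = len(annotated_str)
    for d in range(-window, window + 1):
        lo = begin + d
        if 0 <= lo and lo + width <= len(text):
            if text[lo:lo + width] == annotated_str:
                return d
    return None
```

Running this over a few warnings would show whether the shift is the same constant for every mention, which would point at a fixed-size header being counted (or not counted) on one side.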

Converters do not give meaningful error messages

The current ldc-to-xml converter does not give clear error messages in some places. For example, when the "-o" option is not provided, it just throws a NullPointerException:

Exception in thread "main" java.lang.NullPointerException
at java.io.File.<init>(File.java:277)
at converter.LdcXmlToBratConverter.main(LdcXmlToBratConverter.java:47)

Better error output for the converter, maybe?

Would it be possible to make the error message more specific? We would like to know which particular path is not a directory; we probably don't care much about the parameter name.

For example:

java.lang.IllegalArgumentException: Parameter 'directory' is not a directory
at org.apache.commons.io.FileUtils.validateListFilesParameters(FileUtils.java:545)
at org.apache.commons.io.FileUtils.listFiles(FileUtils.java:521)
at org.apache.commons.io.FileUtils.listFiles(FileUtils.java:691)
at evmeval.TokenFileMaker.createFiles(TokenFileMaker.java:206)
at evmeval.TokenFileMaker.main(TokenFileMaker.java:325)
[ERROR] Stopped due to an error.
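
Checking the paths before handing them to commons-io leaves room for a specific message. Illustrated in Python (the converter is Java, and the helper name is hypothetical):

```python
import os
import sys

def require_directory(path, role):
    # Fail fast with the offending path and its role, rather than
    # letting a library raise a generic IllegalArgumentException.
    if not os.path.isdir(path):
        sys.exit("Error: %s path '%s' is not a directory." % (role, path))
    return path
```

The equivalent Java check before FileUtils.listFiles would replace the stack trace above with something like "Error: annotation path 'foo/ann' is not a directory."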
