Comments (10)

OriShapira commented on June 9, 2024

This sentence:
"Buddhists and Muslims raping and killing each other."
outputs the template:
'{A2} {A1} {A1} killing {A3}'
in which {A1} appears twice, and it is left that way in the output JSON.

Thanks.

from okr.

gabrielStanovsky commented on June 9, 2024

@OriShapira, can you specify in which sentence this happens?

OriShapira commented on June 9, 2024

In tweet 258795112623116288:
"Boy Scouts files on alleged sex abusers to be released -".

gabrielStanovsky commented on June 9, 2024

I don't think it's a bug in the PropS wrapper.
This is what I get from running it as a single sentence (note the template doesn't repeat arguments):

{'Entities': {'A1': ('Boy Scouts files on alleged sex abusers to be released -',
                     (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)),
              'A2': ('alleged sex abusers', (4, 5, 6)),
              'A3': ('on', (3,))},
 'Predicates': {'P1': {'Arguments': ['A1', 'A2', 'A3'],
                       'Bare predicate': ('to be released', (7, 8, 9)),
                       'Head': {'Lemma': u'release',
                                'POS': 'VBN',
                                'Surface': ('released', [9])},
                       'Template': '{A1} {A3} {A2} to be released'}},
 'Sentence': 'Boy Scouts files on alleged sex abusers to be released -'}

OriShapira commented on June 9, 2024

Maybe A1 should be something like "Boy Scouts files", so that the template '{A1} {A3} {A2} to be released' would make more sense.
@kleinay, is this what might be causing the JSON generator to unexpectedly use A1 twice?

gabrielStanovsky commented on June 9, 2024

Side note - if I run the PW on a cleaner version of the sentence (replacing the dash at the end with a full stop), I get something slightly saner (see below).
Maybe we should consider a cleaning phase for the tweets to remove this kind of noise that messes up automatic parsers.

{'Entities': {'A1': ('alleged sex abusers', (4, 5, 6)),
              'A2': ('Boy Scouts', (0, 1))},
 'Predicates': {'P1': {'Arguments': ['A1', 'A2', 'P2'],
                       'Bare predicate': ('files on', (2, 3)),
                       'Head': {'Lemma': u'file',
                                'POS': 'VBZ',
                                'Surface': ('files', [2])},
                       'Template': '{A2} files on {A1} {P2}'},
                'P2': {'Arguments': ['A2'],
                       'Bare predicate': ('to be released', (7, 8, 9)),
                       'Head': {'Lemma': u'release',
                                'POS': 'VBN',
                                'Surface': ('released', [9])},
                       'Template': '{A2} to be released'}},
 'Sentence': 'Boy Scouts files on alleged sex abusers to be released .'}
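
A cleaning phase of the kind suggested above could start as small as the sketch below. It only handles the one noise pattern seen in this thread (a dangling dash at the end of the tweet) and replaces it with a full stop, matching the cleaner sentence used in the run above. The function name `clean_tweet` is hypothetical and is not part of the okr code base or the PropS wrapper.

```python
import re

# Hypothetical helper for illustration; not part of okr or PropS.
# Matches trailing whitespace and hyphens at the very end of the text.
_TRAILING_DASH = re.compile(r"[\s\-]+$")


def clean_tweet(text):
    """Replace a dangling trailing dash with a full stop.

    Text that does not end in a dash is returned with trailing
    whitespace removed and is otherwise left untouched.
    """
    trimmed = text.rstrip()
    stripped = _TRAILING_DASH.sub("", trimmed)
    if stripped == trimmed:
        return trimmed  # no trailing dash, nothing to clean
    # Keep the space before the full stop, as in the tokenized sentence above.
    return stripped + " ."


print(clean_tweet("Boy Scouts files on alleged sex abusers to be released -"))
```

Other kinds of tweet noise (URLs, hashtags, user mentions) would need their own rules; this sketch does not attempt them.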

gabrielStanovsky commented on June 9, 2024

Okay, I introduced a small fix; this looks better now on my side (see below).
The "on" error comes from a wrong underlying parse, caused by the dash at the end.
@kleinay, after you merge the PR we can test whether the issue of repeating arguments is also resolved.

{'Entities': {'A1': ('Boy Scouts files', (0, 1, 2)),
              'A2': ('sex abusers', (5, 6)),
              'A3': ('on', (3,))},
 'Predicates': {'P1': {'Arguments': ['A1', 'A2', 'A3'],
                       'Bare predicate': ('to be released', (7, 8, 9)),
                       'Head': {'Lemma': u'release',
                                'POS': 'VBN',
                                'Surface': ('released', [9])},
                       'Template': '{A1} {A3} {A2} to be released'}},
 'Sentence': 'Boy Scouts files on alleged sex abusers to be released -'}

kleinay commented on June 9, 2024

Following Gabi's fixes, I cannot tell whether the problem persists: the example tweet no longer produces a duplicate-argument template, but in principle this is still a possible scenario. @OriShapira, please update if you see the same problem anywhere after the fix.

kleinay commented on June 9, 2024

In principle, the scenario of duplicated argument slots in the same template is currently possible, because the argument-alignment algorithm we use is trivial: we assign all argument mentions referring to the same concept to the same argument slot (i.e. the proposition-level argument, denoted by a.1, a.2, etc.). Thus, in predicates where two arguments refer to the same concept (e.g. "Bob pictures himself"), the template would contain the same argument slot twice ("a.1 picture a.1"), which is forbidden.
I committed a fix to handle this scenario explicitly. @OriShapira, if you encounter this problem, we will run the auto-pipeline again with the fix to verify that it solves the problem.
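
To check whether a given template still repeats an argument slot, something like the following sketch could be run over the output JSON. It assumes templates use the `{A1}`-style placeholders shown earlier in this thread; the function name `repeated_slots` is hypothetical, and this is not the fix that was committed, only a way to detect the forbidden case.

```python
from collections import Counter
from string import Formatter


def repeated_slots(template):
    """Return the slot names that occur more than once in a template.

    Slot names are read from str.format-style placeholders such as {A1}.
    """
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for a trailing literal with no placeholder.
    names = [field for _, field, _, _ in Formatter().parse(template) if field]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


print(repeated_slots("{A2} {A1} {A1} killing {A3}"))    # ['A1']
print(repeated_slots("{A1} {A3} {A2} to be released"))  # []
```

An empty list means every slot in the template is used at most once.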

kleinay commented on June 9, 2024

This is fixed now, I think. @OriShapira, please try to create the summarization.
