
Comments (5)

nsbgn commented on June 27, 2024

Thanks @HaiqiXu for the explanations in our meeting :) I have a better understanding now (I made some edits to the procedure sketch in my original post).

It turns out that our assumption (Simon's, Enkhbold's, mine) that the grammar parser and the Blockly interface do more or less the same thing (that is, implement the grammar) was not correct. This explains Haiqi's hesitation and means that unifying the grammar parser and the Blockly interface into a single module isn't as straightforward as we might have at first supposed.

That's because the recognition of the core concepts influences the way that grammatical components can combine.¹ Using the concept dictionary, some words are converted into the concepts they represent. It is only then that the phrase is passed to the parser, which identifies the functional roles.
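
To make the current division of labour concrete, here is a minimal sketch of the two stages as I understand them (Python, with entirely hypothetical dictionary entries and role names; the real components are of course more involved): concept substitution happens first, and only then are functional roles assigned.

    # Minimal sketch of the current two-stage pipeline (all names hypothetical).

    # Stage 1: the concept dictionary maps terms to the concepts they represent.
    CONCEPT_DICTIONARY = {
        "road network": "network",
        "travel time": "quantity",
        "hospital": "object",
    }

    def tag_concepts(question: str) -> list[tuple[str, str | None]]:
        """Replace known terms with concept labels; unknown words keep None."""
        text = question.lower().rstrip("?")
        for term, concept in CONCEPT_DICTIONARY.items():
            text = text.replace(term, f"<{concept}>")
        return [(w, w[1:-1]) if w.startswith("<") else (w, None)
                for w in text.split()]

    # Stage 2: only the concept-tagged phrase reaches the parser, so the
    # functional roles it can assign depend on the outcome of stage 1.
    def parse_roles(tagged: list[tuple[str, str | None]]) -> dict[str, str]:
        roles: dict[str, str] = {}
        for word, concept in tagged:
            if concept == "quantity":
                roles["measure"] = word
            elif concept in {"object", "network"}:
                roles.setdefault("subject", word)
        return roles

    print(parse_roles(tag_concepts("What is the travel time to the hospital?")))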

This doesn't mean that we should keep the architecture as it is right now, because problems² remain:

  • There is overlap in functionality: both the blocks and the functional grammar seek to capture complexities of natural language, which is a hard enough problem to contend with once. This means that (1) there are two points of failure, (2) they may even interact in unpredictable ways, and (3) deep knowledge of both subsystems is necessary to make any change, since they must be maintained in parallel.
  • The parser is not resilient against the case where a word is not converted into any concept at all. This will inevitably happen time and time again when users are allowed free input.
  • Information is not passed between the two subsystems. In Blockly, we know exactly where a concept word can occur (we just don't know what type it is). We could already start identifying the concept and deal gracefully with any problems, or at least fail early. Instead, we throw this structural information away and let the parser throw a less understandable error at a later point. (A sketch of failing early follows this list.)
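
To illustrate that third point: because the Blockly interface already knows which slot is supposed to hold a concept word, the check could happen right there, with an error tied to that slot. A rough sketch, with hypothetical names and a trivial lookup standing in for actual concept recognition:

    # Sketch of failing early (hypothetical): the block tells us which field
    # should contain a concept word, so we can validate it immediately and
    # produce an error the user can act on, instead of a parse error later.
    KNOWN_CONCEPTS = {"network", "object", "field", "quantity"}

    def check_concept_slot(block_type: str, field_text: str) -> str:
        concept = field_text.strip().lower()
        if concept not in KNOWN_CONCEPTS:
            raise ValueError(
                f"'{field_text}' in block '{block_type}' was not recognized "
                f"as a concept; please rephrase or pick a suggestion.")
        return concept

    check_concept_slot("amount_block", "network")        # ok
    # check_concept_slot("amount_block", "bicycles")     # clear, early error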

I see two ways in which we can address these problems:

  1. We might be able to coax Blockly into producing dynamic blocks, where the content of inner blocks (i.e. recognized concept types) can influence the output of outer blocks. We could then generate dynamic blocks from a straightforward representation of the grammar. This is elegant and would limit duplication, but could be difficult.

  2. Failing that, we should at least disentangle the functionalities of both systems:

    1. The Blockly interface has the exclusive responsibility of constraining natural language. It should produce a structured representation of important information, and anything it outputs should always be parseable (since the limited places where freeform text occurs can be labelled as such). (A sketch of what such output might look like follows this list.)

    2. The grammar has the exclusive responsibility of identifying functional roles. It would likely look similar to what it is now, but its input would be structured, so we don't have to worry about it being fragile.
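
As a rough illustration of option 2, and nothing more than that (the structure, field names and example question below are all made up), the Blockly side could emit something like the following, so that the grammar never has to re-discover where the concept words and the freeform text are:

    # Made-up example of structured Blockly output for a question such as
    # "What is the travel time from my home to the nearest hospital?".
    # Freeform text is explicitly labelled, so the structure as a whole is
    # always parseable; the grammar only has to assign functional roles.
    blockly_output = {
        "question_type": "service",
        "measure": {"concept_slot": "travel time"},
        "network": {"concept_slot": "road network"},
        "origin": {"freeform": "my home"},
        "destination": {"concept_slot": "hospital", "modifier": "nearest"},
    }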

I'm not yet sure about either of these approaches, so I'm going to play with Blockly and the grammar and see what makes sense.

Footnotes

  1. This is not true for steps that only "clean" the questions, and it also does not seem to be true for the named entity recognition: wherever a named entity can occur, we should already know what type that entity has because of the block it occurs in.

  2. Disregarding issues associated with the conversion of the parse tree into a graph or the conversion of concept types into cct types, since they're not due to the interface/parser.


nsbgn commented on June 27, 2024

Thinking about it more, I think the best approach is to create a declarative representation of the grammar that carries enough information to:

  1. Automatically generate a Blockly interface
  2. Automatically generate the procedures for determining the functional roles from Blockly's output

This way, the grammar can look much like Haiqi's original ANTLR implementation, with the associated ease of editing - but we can still get rid of the parsing step. Also, this would avoid spaghetti coding the two steps in an overzealous effort to unify them (despite the current overlap, there really are conceptually separate procedures at the cores of the two systems).
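
Very roughly, and only as a sketch of the shape of the idea (the rule, slot and role names below are placeholders, not a design), each grammar rule could be a plain data structure from which both the block definition and the role-extraction step are derived:

    # Placeholder sketch: a declarative grammar rule from which both a
    # (simplified) Blockly JSON block definition and the role extraction
    # could be generated. All names are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Slot:
        name: str              # name of the slot inside the block
        concept: str           # expected concept type, e.g. "network"
        role: str              # functional role assigned to whatever fills it
        optional: bool = False

    @dataclass
    class Rule:
        name: str
        slots: list[Slot]

    SERVICE_RULE = Rule("serviceObj", [
        Slot("measure", concept="quantity", role="measure"),
        Slot("network", concept="network", role="support"),
        Slot("origin", concept="object", role="origin", optional=True),
        Slot("destination", concept="object", role="destination", optional=True),
    ])

    def to_block_definition(rule: Rule) -> dict:
        """Derive a simplified Blockly JSON block definition from a rule."""
        return {
            "type": rule.name,
            "message0": " ".join(f"%{i + 1}" for i in range(len(rule.slots))),
            "args0": [{"type": "input_value", "name": s.name, "check": s.concept}
                      for s in rule.slots],
        }

    def extract_roles(rule: Rule, filled: dict) -> dict:
        """Derive the functional roles from a filled-in block of this rule."""
        return {s.role: filled[s.name] for s in rule.slots if s.name in filled}

    print(to_block_definition(SERVICE_RULE))
    print(extract_roles(SERVICE_RULE, {"measure": "travel time",
                                       "network": "road network"}))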

We will also need a tool to check whether all questions can still be represented with Blockly, because it would be a chore to test otherwise. This will be discussed in another issue when we get to it.
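
Purely as a placeholder for what such a check could look like (the corpus file name and the representability test are hypothetical), something along these lines would already make the chore tractable:

    # Hypothetical sketch of the coverage check: run every corpus question
    # through whatever decides representability, and report the failures.
    import json
    from typing import Callable

    def unrepresentable(questions: list[str],
                        representable: Callable[[str], bool]) -> list[str]:
        """Return the questions that cannot (yet) be built with the blocks."""
        return [q for q in questions if not representable(q)]

    if __name__ == "__main__":
        with open("questions.json") as f:               # assumed corpus file
            corpus = json.load(f)
        failures = unrepresentable(corpus, representable=lambda q: True)  # stub
        print(f"{len(failures)} of {len(corpus)} questions cannot be represented")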


nsbgn commented on June 27, 2024

I have created a merge-with-interface branch in this repository to track the process of unifying the two subprojects. This branch is relevant for all subsequent issues (#2, #3, #4, #5, #6, #7, #8, #9).

While I expect the main and haiqi branches to become legacy, they may still receive updates while work is done on the interface. They also probably won't be removed, since the way things are done there corresponds to how things are done in the papers. In other words, it makes sense to continue working on these branches.

However, changes to the Blockly interface should probably not be made to https://github.com/quangis/quangis-web or https://github.com/HaiqiXu/haiqixu.github.io, but here, instead. (And, in time, removed from quangis-web.)


nsbgn commented on June 27, 2024

For reference: I can verify that the recognition of core concepts indeed influences the parsing process, as mentioned in this comment. Consider:

serviceObj
    :   ((time|quantity) 'and'?)+
        'of'? networkQ (('from'|'for'|'of') origin)?
        ('to' destination)?;

networkQ is a recognized entity that is explicitly kept separate from other coreC concepts.
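
To spell out the dependence with a made-up instance (the taggings below are illustrative, not real output of the concept dictionary): the same phrase can only take the serviceObj branch when the relevant words were resolved to networkQ; had they been tagged as some other coreC, the rule above would not apply.

    # Illustrative only: two possible outcomes of concept recognition for the
    # same phrase. The serviceObj alternative above requires a networkQ token,
    # so only the first tagging can take that branch.
    tagging_a = [("travel time", "quantity"), ("of", None),
                 ("road network", "networkQ"), ("from", None),
                 ("home", None), ("to", None), ("work", None)]
    tagging_b = [("travel time", "quantity"), ("of", None),
                 ("road network", "coreC"),      # not recognized as networkQ
                 ("from", None), ("home", None), ("to", None), ("work", None)]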


nsbgn commented on June 27, 2024

Following up on the previous comment, we need a separate module for recognizing concepts. This module would have only one responsibility: converting a string into a concept. This is a difficult task on its own, so it's imperative that it is understandable and replaceable (!) in isolation, not muddled by other concerns like parsing a whole sentence. It can be wrapped up in a service, depending on where it is needed.
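
To make the intended boundary concrete, here is a minimal interface sketch (hypothetical; the real recognizer will be far more involved, and the lexicon below is a stand-in for the actual concept dictionary): one function from string to concept, with an explicit "not recognized" outcome, and nothing about sentences or roles.

    # Minimal sketch of the proposed concept recognizer interface (hypothetical).
    # Single responsibility: map a string to a concept, or report failure.
    # Anything about sentence structure or functional roles lives elsewhere.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class Concept:
        label: str     # e.g. "road network"
        ctype: str     # e.g. "network", "object", "field", "quantity"

    def recognize(text: str) -> Optional[Concept]:
        """Return the concept that `text` denotes, or None if unrecognized."""
        lexicon = {    # stand-in for the real concept dictionary
            "road network": Concept("road network", "network"),
            "temperature": Concept("temperature", "field"),
        }
        return lexicon.get(text.strip().lower())

    print(recognize("Road Network"))   # Concept(label='road network', ctype='network')
    print(recognize("bicycles"))       # None

Because the surface is this small, the lookup can later be replaced by something smarter (fuzzy matching, an ontology query, a web service) without touching either the grammar or the Blockly interface.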

Now, suppose that we accept, for example, a network in a block where the grammar would require a field. Is that a problem?

  1. If the concept inside the block fundamentally influences the context in which it can be applied --- which functional roles are ascribed to it by the grammar --- then things get a bit complicated, because we must limit the blocks to which it can be attached.

    1. We could give the user two otherwise identical blocks with different example text, and have them intuitively infer which one they can use. This places some of the burden of understanding GIS concepts on the user.
    2. We can have a dynamic block that communicates with the concept recognition module to change the block's shape depending on the recognized concept. This involves an uncomfortable amount of magic, both in implementation and in user experience.

    We can do a combination of these.

  2. If, instead, the block can still be sensibly used in all contexts, with the functional roles unaffected, then the concept recognition only serves to pin down the types. This can be done entirely on the server end after the user has finished constructing their queries, as a part of the conversion to the transformation graph.

  3. If the block can be used sensibly in most contexts, but not all, then we can still do everything on the server end and just throw an error in those few cases where the Blockly construction does not accord with the grammar. In this case, we probably want the blocks to output some extra information for verification, so that we don't have to maintain two grammar implementations just for those edge cases.

In any case, the concept recognizer should be isolated.

