
Comments (5)

nsbgn commented on June 27, 2024

Thanks @HaiqiXu for the explanations in our meeting :) I have a better understanding now (I made some edits to the procedure sketch in my original post).

It turns out that our assumption (Simon's, Enkhbold's, mine) that the grammar parser and the Blockly interface do more or less the same thing (that is, implement the grammar) was not correct. This explains Haiqi's hesitation and means that unifying the grammar parser and the Blockly interface into a single module isn't as straightforward as we might have at first supposed.

That's because the recognition of the core concepts influences the way that grammatical components can combine.¹ Using the concept dictionary, some words are converted into the concepts they represent. It is only then that the phrase is passed to the parser, which identifies the functional roles.
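
To make the current division of labour concrete, here is a minimal sketch of the two stages as I understand them (Python, with entirely hypothetical dictionary entries and role names; the real components are of course more involved): concept substitution happens first, and only then are functional roles assigned.

    # Minimal sketch of the current two-stage pipeline (all names hypothetical).

    # Stage 1: the concept dictionary maps terms to the concepts they represent.
    CONCEPT_DICTIONARY = {
        "road network": "network",
        "travel time": "quantity",
        "hospital": "object",
    }

    def tag_concepts(question: str) -> list[tuple[str, str | None]]:
        """Replace known terms with concept labels; unknown words keep None."""
        text = question.lower().rstrip("?")
        for term, concept in CONCEPT_DICTIONARY.items():
            text = text.replace(term, f"<{concept}>")
        return [(w, w[1:-1]) if w.startswith("<") else (w, None)
                for w in text.split()]

    # Stage 2: only the concept-tagged phrase reaches the parser, so the
    # functional roles it can assign depend on the outcome of stage 1.
    def parse_roles(tagged: list[tuple[str, str | None]]) -> dict[str, str]:
        roles: dict[str, str] = {}
        for word, concept in tagged:
            if concept == "quantity":
                roles["measure"] = word
            elif concept in {"object", "network"}:
                roles.setdefault("subject", word)
        return roles

    print(parse_roles(tag_concepts("What is the travel time to the hospital?")))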

This doesn't mean that we should keep the architecture as it is right now, because problems² remain:

  • There is overlap in functionality: both the blocks and the functional grammar seek to capture complexities of natural language, which is a hard enough problem to contend with once. This means that (1) there are two points of failure, (2) they may even interact in unpredictable ways, and (3) deep knowledge of both subsystems is necessary to make any change, since they must be maintained in parallel.
  • The parser is not resilient against the case where a word is not converted into any concept at all. This will inevitably happen time and time again when users are allowed free input.
  • Information is not passed between the two subsystems. In Blockly, we know exactly where a concept word can occur (we just don't know what type it is). We could already start identifying the concept and deal gracefully with any problems, or at least fail early. Instead, we throw this structural information away and let the parser throw a less understandable error at a later point. (A sketch of failing early follows this list.)
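
To illustrate that third point: because the Blockly interface already knows which slot is supposed to hold a concept word, the check could happen right there, with an error tied to that slot. A rough sketch, with hypothetical names and a trivial lookup standing in for actual concept recognition:

    # Sketch of failing early (hypothetical): the block tells us which field
    # should contain a concept word, so we can validate it immediately and
    # produce an error the user can act on, instead of a parse error later.
    KNOWN_CONCEPTS = {"network", "object", "field", "quantity"}

    def check_concept_slot(block_type: str, field_text: str) -> str:
        concept = field_text.strip().lower()
        if concept not in KNOWN_CONCEPTS:
            raise ValueError(
                f"'{field_text}' in block '{block_type}' was not recognized "
                f"as a concept; please rephrase or pick a suggestion.")
        return concept

    check_concept_slot("amount_block", "network")        # ok
    # check_concept_slot("amount_block", "bicycles")     # clear, early error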

I see two ways in which we can address these problems:

  1. We might be able to coax Blockly into producing dynamic blocks, where the content of inner blocks (i.e. recognized concept types) can influence the output of outer blocks. We could then generate dynamic blocks from a straightforward representation of the grammar. This is elegant and would limit duplication, but could be difficult.

  2. Failing that, we should at least disentangle the functionalities of both systems:

    1. The Blockly interface has the exclusive responsibility of constraining natural language. It should produce a structured representation of important information, and anything it outputs should always be parseable (since the limited places where freeform text occurs can be labelled as such). (A sketch of what such output might look like follows this list.)

    2. The grammar has the exclusive responsibility of identifying functional roles. It would likely look similar to what it is now, but its input would be structured, so we don't have to worry about it being fragile.
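
As a rough illustration of option 2, and nothing more than that (the structure, field names and example question below are all made up), the Blockly side could emit something like the following, so that the grammar never has to re-discover where the concept words and the freeform text are:

    # Made-up example of structured Blockly output for a question such as
    # "What is the travel time from my home to the nearest hospital?".
    # Freeform text is explicitly labelled, so the structure as a whole is
    # always parseable; the grammar only has to assign functional roles.
    blockly_output = {
        "question_type": "service",
        "measure": {"concept_slot": "travel time"},
        "network": {"concept_slot": "road network"},
        "origin": {"freeform": "my home"},
        "destination": {"concept_slot": "hospital", "modifier": "nearest"},
    }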

I'm not yet sure about either of these approaches, so I'm going to play with Blockly and the grammar and see what makes sense.

Footnotes

  1. This is not true for steps that only "clean" the questions, and it also does not seem to be true for the named entity recognition: wherever a named entity can occur, we should already know what type that entity has because of the block it occurs in.

  2. Disregarding issues associated with the conversion of the parse tree into a graph or the conversion of concept types into cct types, since they're not due to the interface/parser.


nsbgn commented on June 27, 2024

Thinking about it more, I think the best approach is to create a declarative representation of the grammar that carries enough information to:

  1. Automatically generate a Blockly interface
  2. Automatically generate the procedures for determining the functional roles from Blockly's output

This way, the grammar can look much like Haiqi's original ANTLR implementation, with the associated ease of editing - but we can still get rid of the parsing step. Also, this would avoid spaghetti coding the two steps in an overzealous effort to unify them (despite the current overlap, there really are conceptually separate procedures at the cores of the two systems).
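
Very roughly, and only as a sketch of the shape of the idea (the rule, slot and role names below are placeholders, not a design), each grammar rule could be a plain data structure from which both the block definition and the role-extraction step are derived:

    # Placeholder sketch: a declarative grammar rule from which both a
    # (simplified) Blockly JSON block definition and the role extraction
    # could be generated. All names are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Slot:
        name: str              # name of the slot inside the block
        concept: str           # expected concept type, e.g. "network"
        role: str              # functional role assigned to whatever fills it
        optional: bool = False

    @dataclass
    class Rule:
        name: str
        slots: list[Slot]

    SERVICE_RULE = Rule("serviceObj", [
        Slot("measure", concept="quantity", role="measure"),
        Slot("network", concept="network", role="support"),
        Slot("origin", concept="object", role="origin", optional=True),
        Slot("destination", concept="object", role="destination", optional=True),
    ])

    def to_block_definition(rule: Rule) -> dict:
        """Derive a simplified Blockly JSON block definition from a rule."""
        return {
            "type": rule.name,
            "message0": " ".join(f"%{i + 1}" for i in range(len(rule.slots))),
            "args0": [{"type": "input_value", "name": s.name, "check": s.concept}
                      for s in rule.slots],
        }

    def extract_roles(rule: Rule, filled: dict) -> dict:
        """Derive the functional roles from a filled-in block of this rule."""
        return {s.role: filled[s.name] for s in rule.slots if s.name in filled}

    print(to_block_definition(SERVICE_RULE))
    print(extract_roles(SERVICE_RULE, {"measure": "travel time",
                                       "network": "road network"}))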

We will also need a tool to check whether all questions can still be represented with Blockly, because it would be a chore to test otherwise. This will be discussed in another issue when we get to it.
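
Purely as a placeholder for what such a check could look like (the corpus file name and the representability test are hypothetical), something along these lines would already make the chore tractable:

    # Hypothetical sketch of the coverage check: run every corpus question
    # through whatever decides representability, and report the failures.
    import json
    from typing import Callable

    def unrepresentable(questions: list[str],
                        representable: Callable[[str], bool]) -> list[str]:
        """Return the questions that cannot (yet) be built with the blocks."""
        return [q for q in questions if not representable(q)]

    if __name__ == "__main__":
        with open("questions.json") as f:               # assumed corpus file
            corpus = json.load(f)
        failures = unrepresentable(corpus, representable=lambda q: True)  # stub
        print(f"{len(failures)} of {len(corpus)} questions cannot be represented")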


nsbgn commented on June 27, 2024

I have created a merge-with-interface branch in this repository to track the process of unifying the two subprojects. This branch is relevant for all subsequent issues (#2, #3, #4, #5, #6, #7, #8, #9).

While I expect the main and haiqi branches to become legacy, they may still receive updates while work is done on the interface. They also probably won't be removed, since the way things are done there corresponds to how things are done in the papers. In other words, it makes sense to continue working on these branches.

However, changes to the Blockly interface should probably not be made to https://github.com/quangis/quangis-web or https://github.com/HaiqiXu/haiqixu.github.io, but here, instead. (And, in time, removed from quangis-web.)


nsbgn commented on June 27, 2024

For reference: I can verify that the recognition of core concepts indeed influences the parsing process, as mentioned in this comment. Consider:

serviceObj
    :   ((time|quantity) 'and'?)+
        'of'? networkQ (('from'|'for'|'of') origin)?
        ('to' destination)?;

networkQ is a recognized entity that is explicitly kept separate from other coreC concepts.
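
To spell out the dependence with a made-up instance (the taggings below are illustrative, not real output of the concept dictionary): the same phrase can only take the serviceObj branch when the relevant words were resolved to networkQ; had they been tagged as some other coreC, the rule above would not apply.

    # Illustrative only: two possible outcomes of concept recognition for the
    # same phrase. The serviceObj alternative above requires a networkQ token,
    # so only the first tagging can take that branch.
    tagging_a = [("travel time", "quantity"), ("of", None),
                 ("road network", "networkQ"), ("from", None),
                 ("home", None), ("to", None), ("work", None)]
    tagging_b = [("travel time", "quantity"), ("of", None),
                 ("road network", "coreC"),      # not recognized as networkQ
                 ("from", None), ("home", None), ("to", None), ("work", None)]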


nsbgn commented on June 27, 2024

Following up on the previous comment, we need a separate module for recognizing concepts. This module would have only one responsibility: converting a string into a concept. This is a difficult task on its own, so it's imperative that it is understandable and replaceable (!) in isolation, not muddled by other concerns like parsing a whole sentence. It can be wrapped up in a service, depending on where it is needed.
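
To make the intended boundary concrete, here is a minimal interface sketch (hypothetical; the real recognizer will be far more involved, and the lexicon below is a stand-in for the actual concept dictionary): one function from string to concept, with an explicit "not recognized" outcome, and nothing about sentences or roles.

    # Minimal sketch of the proposed concept recognizer interface (hypothetical).
    # Single responsibility: map a string to a concept, or report failure.
    # Anything about sentence structure or functional roles lives elsewhere.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class Concept:
        label: str     # e.g. "road network"
        ctype: str     # e.g. "network", "object", "field", "quantity"

    def recognize(text: str) -> Optional[Concept]:
        """Return the concept that `text` denotes, or None if unrecognized."""
        lexicon = {    # stand-in for the real concept dictionary
            "road network": Concept("road network", "network"),
            "temperature": Concept("temperature", "field"),
        }
        return lexicon.get(text.strip().lower())

    print(recognize("Road Network"))   # Concept(label='road network', ctype='network')
    print(recognize("bicycles"))       # None

Because the surface is this small, the lookup can later be replaced by something smarter (fuzzy matching, an ontology query, a web service) without touching either the grammar or the Blockly interface.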

Now, suppose that we accept, for example, a network in a block where the grammar would require a field. Is that a problem?

  1. If the concept inside the block fundamentally influences the context in which it can be applied --- which functional roles are ascribed to it by the grammar --- then things get a bit complicated, because we must limit the blocks to which it can be attached.

    1. We could give the user two otherwise identical blocks with different example text, and have them intuitively infer which one they can use. This places some of the burden of understanding GIS concepts on the user.
    2. We can have a dynamic block that communicates with the concept recognition module to change the block's shape depending on the recognized concept. This involves an uncomfortable amount of magic, both in implementation and in user experience.

    We can do a combination of these.

  2. If, instead, the block can still be sensibly used in all contexts, with the functional roles unaffected, then the concept recognition only serves to pin down the types. This can be done entirely on the server end after the user has finished constructing their queries, as a part of the conversion to the transformation graph.

  3. If the block can be used sensibly in most contexts, but not all, then we can still do everything on the server end and just throw an error in those few cases where the Blockly construction does not accord with the grammar. In this case, we probably want the blocks to output some extra information for verification, so that we don't have to maintain two grammar implementations just for those edge cases.

In any case, the concept recognizer should be isolated.

