gojsonsm's Issues

Extra parentheses lead to incorrect logic

For this expression:
(county = "United States" OR country = "Canada" AND type="brewery") OR (type="beer" AND DATE(updated) >= DATE("2019-01-01"))

things work correctly.

However, when extra parentheses are added:
((county = "United States" OR country = "Canada") AND type="brewery") OR (type="beer" AND DATE(updated) >= DATE("2019-01-01"))

the logic is broken.

Need to investigate why the AST is different.

Add support for multiple roots.

The XDCR team at Couchbase has requested that it be possible to specify multiple roots (i.e., to provide matching against explicitly, separately defined documents).

Add Support for LIKE (Regexp)

JSONSM should support performing filtering based on a regexp string match. This is an issue to track ownership, requirements and implementation.

Null Fastval should not output as 0 when used as numeric values

Right now a NullValue FastVal returns 0 or 0.0 when its AsUint()/AsInt()/AsFloat64() is called.
Since 0 is a commonly used value, this can easily lead to an incorrect match. Consider the following test case, which produces the wrong result:

	fe = &FilterExpression{}
	err = parser.ParseString("fieldpath.path IS NOT NULL", fe)
...
	userData := map[string]interface{}{
		"fieldpath": map[string]interface{}{
			"path": 0,
		},
	}
	udMarsh, _ := json.Marshal(userData)
	match, err := m.Match(udMarsh)
...

fieldpath.path is not null in this case, but the parser inserts a nil value on the right-hand side of the EqualsExpr, and that nil equates to 0 when compared with the left-hand side's 0 (int), so the evaluation produces the incorrect result.

The simplest way to do it IMO is to make NullValue output MinInt64/MaxUint64/MaxFloat64 when AsInt()/AsUint()/AsFloat64() is called. Those values are unlikely to exist in real-world scenarios.
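A minimal sketch of the collision and of the sentinel idea. The `kind` and `value` types and their methods are hypothetical stand-ins written for this illustration; they are not gojsonsm's FastVal and only model the null-versus-zero behaviour described above.

```go
package main

import (
	"fmt"
	"math"
)

// kind and value are hypothetical stand-ins for a FastVal-like type.
type kind int

const (
	nullKind kind = iota
	intKind
)

type value struct {
	k kind
	i int64
}

// asIntZero mimics the reported behaviour: a null converts to 0.
func (v value) asIntZero() int64 {
	if v.k == nullKind {
		return 0
	}
	return v.i
}

// asIntSentinel mimics the proposal: a null converts to MinInt64,
// a value that is unlikely to appear in real documents.
func (v value) asIntSentinel() int64 {
	if v.k == nullKind {
		return math.MinInt64
	}
	return v.i
}

func main() {
	null := value{k: nullKind}
	zero := value{k: intKind, i: 0}

	// With the zero conversion, null and 0 compare equal.
	fmt.Println(null.asIntZero() == zero.asIntZero())
	// With the sentinel, they no longer collide.
	fmt.Println(null.asIntSentinel() == zero.asIntSentinel())
}
```

The sentinel only makes a collision unlikely, not impossible: a document that really contains MinInt64 would still compare equal to null.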

Another option is to do more complicated checks in CompareOp, which is more correct, but that would break the neat symmetry of the current code.

Thoughts, @brett19 ?

Parenthesis parsing issues

A few odd parenthesis parsing cases:

This is valid:
(country="United States" OR country="Canada") AND type="brewery"
But this is invalid:
((country="United States" OR country="Canada") AND type="brewery")

due to potential recapturing of the AND clause?

Matcher and simpleParser to support Time based comparison

One function that would be good to have in the matcher is time comparison.
The thinking is to add a new type of FastVal called "TimeValue", which will hold a golang Time struct; the time package provides the comparators Before, After, Equal, etc. We'll stick with the RFC-3339 format used by the golang time library, since N1QL uses ISO-8601 but Golang doesn't directly support ISO-8601.

The gap here is that many documents store dates as "YYYY-MM-DD", and we can't match on those. Perhaps we can address that in another PR/issue. Is there a good way to address it at this point, though?

For corner cases where the two sides have different time precision, we will simply pass the values in and let the time library do the comparison. For example:
If the user checks whether a field equals the time "2018-11-21T00:01:02Z", but the document field holds "2018-11-21T00:01:02.03Z", the Equal() operator will return false.

The workaround is for the user to specify a range that covers that specific time in the document, as we don't want to spend resources guessing the user's intent.
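The Equal() corner case and the range workaround can be sketched with the standard time package alone. The helper names `sameInstant` and `withinSecond` are made up for this example, and the one-second window is an assumption about what range a user might pick.

```go
package main

import (
	"fmt"
	"time"
)

// sameInstant parses two RFC 3339 timestamps and compares them with Equal.
func sameInstant(a, b string) (bool, error) {
	ta, err := time.Parse(time.RFC3339, a)
	if err != nil {
		return false, err
	}
	tb, err := time.Parse(time.RFC3339, b)
	if err != nil {
		return false, err
	}
	return ta.Equal(tb), nil
}

// withinSecond reports whether doc falls in the half-open range
// [target, target+1s), which is one way to express the workaround.
func withinSecond(doc, target string) (bool, error) {
	td, err := time.Parse(time.RFC3339, doc)
	if err != nil {
		return false, err
	}
	tt, err := time.Parse(time.RFC3339, target)
	if err != nil {
		return false, err
	}
	return !td.Before(tt) && td.Before(tt.Add(time.Second)), nil
}

func main() {
	// The fractional second makes the two instants differ.
	eq, _ := sameInstant("2018-11-21T00:01:02Z", "2018-11-21T00:01:02.03Z")
	fmt.Println(eq) // false

	// A range around the target time covers the document value.
	in, _ := withinSecond("2018-11-21T00:01:02.03Z", "2018-11-21T00:01:02Z")
	fmt.Println(in) // true

	// A date-only string does not parse with the RFC 3339 layout.
	_, err := sameInstant("2018-11-21", "2018-11-21T00:01:02Z")
	fmt.Println(err != nil) // true
}
```

The last call also shows the "YYYY-MM-DD" gap mentioned earlier: such values are rejected by the RFC 3339 layout and would need separate handling.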

Failure when inner loop references not-yet-parsed field from outer loop

Given this JSON document:

{
  "foo": [
    {
      "bar": [1, 2, 3],
      "zot" : 2
    }
  ]
}

The following match expression fails because zot appears in the JSON after bar, and the code that defers loop evaluation only looks for fields rooted at $doc (ignores fields rooted at outer loops).

any $f in $doc.foo {
    any $b in $f.bar {
        $f.zot == $b
    }
}
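For reference, the intended meaning of the expression can be written as plain nested loops. This sketch decodes the whole document with encoding/json first, so field order does not matter here; it only shows the expected result, not how the streaming matcher evaluates it. The `doc` and `item` types and the `matches` function are invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// item and doc mirror the shape of the sample document above.
type item struct {
	Bar []int `json:"bar"`
	Zot int   `json:"zot"`
}

type doc struct {
	Foo []item `json:"foo"`
}

// matches evaluates: any $f in $doc.foo { any $b in $f.bar { $f.zot == $b } }
func matches(raw []byte) (bool, error) {
	var d doc
	if err := json.Unmarshal(raw, &d); err != nil {
		return false, err
	}
	for _, f := range d.Foo {
		for _, b := range f.Bar {
			if f.Zot == b {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	ok, err := matches([]byte(`{"foo":[{"bar":[1,2,3],"zot":2}]}`))
	fmt.Println(ok, err) // true <nil>
}
```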

Implement mathRound function in parser

The matcher already supports mathRound. This issue tracks the implementation of functions in simpleParser, starting with the ROUND() function. The goal is to set things up so that implementing future functions is painless.
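One possible way to keep future functions cheap to add is a name-to-implementation table. This is only a sketch: `mathFuncs` and `callFunc` are hypothetical names, and nothing here reflects how simpleParser actually wires functions to the matcher.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// mathFuncs is a hypothetical lookup table from function names, as they
// would appear in a filter expression, to their implementations.
var mathFuncs = map[string]func(float64) float64{
	"ROUND": math.Round,
}

// callFunc looks a function up case-insensitively and applies it.
func callFunc(name string, arg float64) (float64, error) {
	fn, ok := mathFuncs[strings.ToUpper(name)]
	if !ok {
		return 0, fmt.Errorf("unknown function %q", name)
	}
	return fn(arg), nil
}

func main() {
	v, err := callFunc("round", 2.4)
	fmt.Println(v, err) // 2 <nil>

	_, err = callFunc("nope", 1)
	fmt.Println(err != nil) // true
}
```

With this layout, supporting another single-argument function would be one extra table entry, for example `"CEIL": math.Ceil`.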

Multi-token parser loop bug

Part of the SimpleParser's multi-token loop is incorrect: it should retry the whole loop whenever something is marked invalid.

support for full object/array comparisons?

I think it might make sense to support comparing objects and/or arrays using the Equals/NotEquals operators. For objects, this would come in the form of comparison that the keys and values match (but not necessarily the ordering), and for arrays that the elements and ordering match. I think this might actually be possible to implement today using a compound operator which decomposes an array or object into a set of EQUALs checks (and maybe something to confirm the length of the object/array matching).
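The proposed semantics (object keys compared regardless of order, array elements compared in order) happen to be what reflect.DeepEqual gives for values decoded by encoding/json. The sketch below uses that to illustrate the expected results only; it decodes both values fully and says nothing about how a compound operator would be built inside the matcher. `jsonEqual` is a name made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// jsonEqual decodes both inputs and compares them structurally.
// Maps compare by key/value regardless of order; slices compare
// element by element, in order.
func jsonEqual(a, b string) (bool, error) {
	var x, y interface{}
	if err := json.Unmarshal([]byte(a), &x); err != nil {
		return false, err
	}
	if err := json.Unmarshal([]byte(b), &y); err != nil {
		return false, err
	}
	return reflect.DeepEqual(x, y), nil
}

func main() {
	objs, _ := jsonEqual(`{"a":1,"b":2}`, `{"b":2,"a":1}`)
	fmt.Println(objs) // true: key order is ignored

	arrs, _ := jsonEqual(`[1,2]`, `[2,1]`)
	fmt.Println(arrs) // false: element order matters

	sizes, _ := jsonEqual(`[1,2]`, `[1,2,3]`)
	fmt.Println(sizes) // false: lengths differ
}
```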

Add support for functions as part of filter expressions

There are a few functions which make a lot of sense to be included in JSONSM. Some of the ones that particularly come to mind are non-trivial mathematical functions, or date/time style functions (since JSON doesn't have one set standard here). JSONSM should be expanded to support this. This issue is for discussion on the possible implementations of this.

Support for (negative) lookahead/lookbehind using pcre

One of the requirements for XDCR as a consumer of gojsonsm is to be able to support (negative) lookahead/lookbehind (MB-30311). From various conversations in the past, we have decided to go with pcre as the library of choice for doing such matching.
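For context on why a separate library is needed: Go's standard regexp package follows RE2 syntax, which does not offer lookaround, so such patterns fail to compile. The short check below shows this with the standard library only; `compiles` is a helper written for the example, and no pcre binding is used.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether the standard regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`foo(bar)?`))   // true: ordinary group
	fmt.Println(compiles(`foo(?!bar)`))  // false: negative lookahead
	fmt.Println(compiles(`(?<!foo)bar`)) // false: negative lookbehind
}
```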

This issue tracks the gojsonsm side of things.

NotEquals transformation should use NOT expression

At the end, resolve() marks any unresolved node as false.

As a consequence, given an expression
field1 <> "value"
where field1 is not present, the result evaluates to false.

Since field1 is not present, this statement should logically be true.
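The difference between the two readings can be sketched on a plain map. The functions below are hypothetical models written for this illustration: `notEqualsDirect` treats a missing field as unresolved and therefore false, while `notEqualsViaNot` rewrites the check as NOT(field = value).

```go
package main

import "fmt"

// equals reports whether the field is present and equal to want.
// A missing field yields false, like an unresolved node.
func equals(doc map[string]string, field, want string) bool {
	got, ok := doc[field]
	return ok && got == want
}

// notEqualsDirect models the reported behaviour: a missing field
// leaves the node unresolved, which is then marked false.
func notEqualsDirect(doc map[string]string, field, want string) bool {
	got, ok := doc[field]
	return ok && got != want
}

// notEqualsViaNot models the suggested transformation NOT(field = want).
func notEqualsViaNot(doc map[string]string, field, want string) bool {
	return !equals(doc, field, want)
}

func main() {
	doc := map[string]string{"other": "x"} // field1 is absent

	fmt.Println(notEqualsDirect(doc, "field1", "value")) // false
	fmt.Println(notEqualsViaNot(doc, "field1", "value")) // true
}
```

When the field is present, both versions agree; they differ only for the missing-field case this issue describes.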

Matcher doesn't perform EXIST correctly on key with value of embedded map

Given a data structure of:

	userData := map[string]interface{}{
		"KEY": map[string]interface{}{
			"internalKey": "value",
		},
	}

A filter expression of "KEY EXISTS" fails, even though the key does exist.
From what I can tell, matchExec sees KEY and then fetches two more tokens, which are ":" followed by "{".

Something like this:

NEIL DEBUG testMap: map[[$%XDCRInternalKey*%$]:TestDocKey Key:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA [$%XDCRInternalMeta*%$]:map[AnotherXattr:TestValueString TestXattr:30]]
NEIL DEBUG Expr: $doc.[$%XDCRInternalMeta*%$] EXISTS
NEIL DEBUG objStart going into objOrArray
NEIL DEBUG tokenData: "[$%XDCRInternalKey*%$]"
NEIL DEBUG autostep tokenData: :
NEIL DEBUG autostep2 tokenData: "TestDocKey"
NEIL DEBUG KeyString: [$%XDCRInternalKey*%$]
NEIL DEBUG tokenData: "Key"
NEIL DEBUG autostep tokenData: :
NEIL DEBUG autostep2 tokenData: "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
NEIL DEBUG KeyString: Key
NEIL DEBUG tokenData: "[$%XDCRInternalMeta*%$]"
NEIL DEBUG autostep tokenData: :
NEIL DEBUG autostep2 tokenData: {
NEIL DEBUG KeyString: [$%XDCRInternalMeta*%$]
NEIL DEBUG KeyString found with token: 1 tokenData: { keyElem: :ops
[0] @ exists @

So it sees that there are no operations for tokenData "{" and then bails, and the original [0] gets lost.

Matcher to support negative array index

N1QL supports negative array indexes, so people can write something like arr[-1]. Golang does not support this in its own slice indexing, but to stay close to N1QL, I'm thinking we may need to support it at the matcher level.
Any thoughts on this?
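A minimal sketch of how a negative index could be normalized, assuming arr[-1] means the last element and that out-of-range indexes simply do not resolve. `resolveIndex` is a hypothetical helper, not an existing gojsonsm function.

```go
package main

import "fmt"

// resolveIndex maps a possibly negative index onto an array of length n.
// Negative indexes count from the end. The second return value is false
// when the index falls outside the array.
func resolveIndex(idx, n int) (int, bool) {
	if idx < 0 {
		idx += n
	}
	if idx < 0 || idx >= n {
		return 0, false
	}
	return idx, true
}

func main() {
	arr := []string{"a", "b", "c"}

	if i, ok := resolveIndex(-1, len(arr)); ok {
		fmt.Println(arr[i]) // c
	}
	_, ok := resolveIndex(-4, len(arr))
	fmt.Println(ok) // false
}
```

One consequence for a streaming matcher is that a negative index cannot be resolved until the array length is known, that is, until the whole array has been scanned.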

Matcher finds slot start/end via odd means

We currently back up the current position by the tokenData length:

startPos -= len(tokenData)

This is probably not entirely safe, and may lead to odd bugs. We should probably update the tokenizer to support fetching the last position, or potentially pass the last position through to matchExec so it can use that.

Implement misc math functions

Now that simpleParser and the matcher both have a basic framework for math functions, this issue tracks the other math functions that should be implemented.

Add support for merged expressions

We should add support for taking multiple expressions and generating a matcher definition that can match against several of these expressions in a single pass.

NOT objects do not invert unresolved values

An expression such as NOT(name eq "frank") with the name field not existing will cause the expression to fail. This may be expected behaviour, but I think that the more intuitive behaviour would be to implement some form of post-completion resolution of unresolved leaf nodes to false.

FastVal IsString function is called IsStringLike

The existing logic in FastVal makes it so that checking a specific datatype is done using the dataType field (possibly should be moved to a function). However, the method to check if something is string-typed is called IsStringLike rather than simply IsString. This should be fixed to match the rest of the functions.

Short loop expression will incorrectly become resolved

In the case of a very simple loop, the binTree bucket of the loop ends up being the root element. This causes the IsResolved check within the matcher to trigger when the loop runs once, rather than when the stallIndex is reset after the loop.

After blocks get misplaced in the match tree

For some more complicated expressions, when the dependencies are entirely within a subtree, the after block is placed on the root of the document instead of on the subtree.

Expression:
		$doc.name.first = $doc.name.last
Transformed:
		match tree:
		  :elems
		    `name`:
		      :elems
		        `first`:
		          :store $1
		        `last`:
		          :store $2
		  :after:
		    #with $1:
		      :ops
		        [0] eq $2
		bin tree:
		  [0:0] leaf
		match buckets:
		  0: 0
		num buckets: 1
		num fetches: 2
		max depth: 1
