should all tags have user defined keys? (metrictank, closed)

grafana commented on May 18, 2024

should all tags have user defined keys?

from metrictank.

Comments (8)

woodsaj commented on May 18, 2024

The problem i have with single strings is that it makes querying via Elasticsearch much more difficult and less intuitive. Elasticsearch will already tokenize the document, splitting on whitespace and other common delimiters like ":=,"

E.g., with the following example documents:

{
  "title": "Doc1",
  "tags": [
    "key1=foo1",
    "key2=foo2"
  ]
},
{
  "title": "Doc2",
  "tags": [
    "key5=foo1",
    "key2=foo3"
  ]
},
{
  "title": "Doc3",
  "tags": [
    "foobar"
  ]
},
{
  "title": "Doc4",
  "tags": [
    "key1=foo1",
    "key1=bar2"
  ]
}

if i wanted to match every document where key1==foo1, then i would need to quote my search query.

{
  "query": {
    "query_string": {
        "query": "tags:\"key1=foo1\""
    }
  }
}

which would match documents 1 and 4

The gotcha here is that if you don't provide the quotes, the search will match all documents that have a tag that contains "key1" or "foo1", matching documents 1, 2 and 4. This is certainly not the intended result.

Additionally, because of the quoting, you lose the ability to do partial matches, e.g. where the value of key "key2" starts with "foo".

However, as a benefit, if you just provided a search query of "tags:foo1" it would return all documents that have a tag that has a key or a value set to "foo1"
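
The over-matching described above can be reproduced with a rough Python simulation. This is not Elasticsearch: `tokenize` is a hypothetical stand-in that only splits on whitespace, "=" and ",", and the two matchers merely approximate an unquoted (OR of terms) query and a quoted (phrase) query.

```python
import re

# Example documents from the comment above: tags stored as "key=value" strings.
DOCS = {
    "Doc1": ["key1=foo1", "key2=foo2"],
    "Doc2": ["key5=foo1", "key2=foo3"],
    "Doc3": ["foobar"],
    "Doc4": ["key1=foo1", "key1=bar2"],
}


def tokenize(text):
    """Hypothetical stand-in for the analyzer: split on whitespace, '=' and ','."""
    return [t for t in re.split(r"[\s=,]+", text) if t]


def match_any_term(query, tags):
    """Unquoted query: matches if any query term equals any indexed term."""
    indexed = {term for tag in tags for term in tokenize(tag)}
    return any(term in indexed for term in tokenize(query))


def match_phrase(query, tags):
    """Quoted query: the query terms must appear consecutively within one tag."""
    wanted = tokenize(query)
    for tag in tags:
        terms = tokenize(tag)
        for i in range(len(terms) - len(wanted) + 1):
            if terms[i:i + len(wanted)] == wanted:
                return True
    return False


def search(matcher, query):
    """Return the sorted names of all documents accepted by the matcher."""
    return sorted(name for name, tags in DOCS.items() if matcher(query, tags))


print(search(match_any_term, "key1=foo1"))  # ['Doc1', 'Doc2', 'Doc4']
print(search(match_phrase, "key1=foo1"))    # ['Doc1', 'Doc4']
```

The unquoted form picks up Doc2 only because "foo1" is indexed as a term of its own, which is the gotcha noted above.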

With a key:value schema:

{
  "title": "Doc1",
  "tags": {
    "key1": "foo1",
    "key2": "foo2"
  }
},
{
  "title": "Doc2",
  "tags": {
    "key5": "foo1",
    "key2": "foo3"
  }
},
{
  "title": "Doc3",
  "tags": {
    "foobar": "true"
  }
},
{
  "title": "Doc4",
  "tags": {
    "key1": "foo1 bar2"
  }
}

The same search query to match where key1==foo1 becomes

{
  "query": {
    "query_string": {
        "query": "tags.key1:foo1"
    }
  }
}

No quoting necessary.

to match where key2 starts with foo

 "query": "tags.key2:foo*"

It would not be possible (to my knowledge) to send a query that would match either the key or the value. But you can match the value across all keys by including a "fields" field to limit the scope of the query.

{
  "query": {
    "query_string": {
       "fields": ["tags.*"],
       "query": "foo1"
    }
  }
}

which would match documents 1, 2 and 4

Including the "fields" field on all queries would probably be a good practice anyway, and it wouldn't affect tags.key:value format queries.
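
As a small illustration of always sending "fields", a client could build every request body through one helper. This is only a sketch: `query_string_body` is a hypothetical name, and it produces nothing beyond the JSON structure already shown in this comment.

```python
import json


def query_string_body(query, fields=("tags.*",)):
    """Build a query_string request body limited to the given fields."""
    return {"query": {"query_string": {"fields": list(fields), "query": query}}}


# Match the value "foo1" under any key of the "tags" object.
any_key = query_string_body("foo1")

# A key-specific query goes through the same helper.
prefix = query_string_body("tags.key2:foo*")

print(json.dumps(any_key, indent=2))
print(json.dumps(prefix, indent=2))
```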

woodsaj commented on May 18, 2024

ok. So i have been reading up on this and experimenting with Elasticsearch, and it is looking more and more like single strings with ":" separated key value pairs is easiest to deal with.

As noted in the previous comments, when using an Object approach for key:value pairs, it is not possible to search where the key matches the query string. This is a pretty big issue given that any UI would want to provide suggestions as users enter the query.

Turns out the default tokenizer in Elasticsearch won't split on ":" characters, so "key:value" will be treated as 1 term, whereas "key=value" would be treated as 2 ["key", "value"]. Users would still be able to match a query across all keys by searching "*:value" (the ':' needs to be escaped)

{  
  "query": {  
    "query_string": {  
      "fields": [ "tags" ],
      "query": "*\\:foo1"
    }
  }
}
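
The escaping is easy to get wrong because there are two layers: query_string wants a backslash before the ":", and JSON then doubles that backslash. A minimal Python sketch, assuming hypothetical helper names `escape_colons` and `any_key_query`:

```python
import json


def escape_colons(term):
    """Backslash-escape ':' so query_string does not read it as field:value."""
    return term.replace(":", "\\:")


def any_key_query(value, field="tags"):
    """Hypothetical helper: match tags of the form '<any key>:<value>'."""
    query = "*" + escape_colons(":" + value)
    return {"query": {"query_string": {"fields": [field], "query": query}}}


body = any_key_query("foo1")

# The Python string holds a single backslash; json.dumps doubles it on the wire.
print(body["query"]["query_string"]["query"])
print(json.dumps(body))
```

The second line printed is the same body as the JSON shown above, with "*\\:foo1" as the query.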

Dieterbe commented on May 18, 2024

ok so you're saying we would store tags as an array of strings like "key:val" instead of using "="?

does this hamper or reinforce the feasibility of key-less tags, and why?

woodsaj commented on May 18, 2024

So yes, i am saying stick with an array of strings. Splitting the string into key/value pairs would be left up to the client querying the index (graphite/grafana). Though for simplicity, using a colon ":" as the delimiter is the preferred approach.

This approach will allow users to use key-less tags. Using key-less tags will however limit capabilities, as the user won't be able to perform groupBy, AliasBy, etc. style transformations.
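
A sketch of what that client-side splitting could look like, in Python for illustration. `split_tag` and `group_by_key` are hypothetical names, and splitting on the first ":" only is an assumption; the thread does not say how values that contain a colon should be handled.

```python
def split_tag(tag, delimiter=":"):
    """Split 'key:value' into (key, value); key-less tags return (None, tag)."""
    key, sep, value = tag.partition(delimiter)
    if not sep:
        return None, tag
    return key, value


def group_by_key(tags):
    """Collect values per key; key-less tags are returned separately."""
    keyed, keyless = {}, []
    for tag in tags:
        key, value = split_tag(tag)
        if key is None:
            keyless.append(value)
        else:
            keyed.setdefault(key, []).append(value)
    return keyed, keyless


keyed, keyless = group_by_key(["key1:foo1", "key1:bar2", "foobar"])
print(keyed)    # {'key1': ['foo1', 'bar2']}
print(keyless)  # ['foobar']
```

Key-less tags end up in their own list, so there is no key to group or alias them by.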

Dieterbe commented on May 18, 2024

okay. so actionable items?

move to []string instead of a map and use : as delimiter?

probably best if we can also put this change as part of the nsq migration.
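
For illustration, the map-to-strings conversion could look roughly like this. It is a Python sketch with a hypothetical `tags_to_strings` helper, not the actual collector code; sorting the output and rejecting keys that contain the delimiter are assumptions added here.

```python
def tags_to_strings(tags, delimiter=":"):
    """Flatten a {key: value} mapping into sorted 'key:value' strings."""
    for key in tags:
        if delimiter in key:
            # Assumption: such a key could not be split back out unambiguously.
            raise ValueError(f"tag key {key!r} contains the delimiter {delimiter!r}")
    return sorted(f"{key}{delimiter}{value}" for key, value in tags.items())


print(tags_to_strings({"key2": "foo2", "key1": "foo1"}))  # ['key1:foo1', 'key2:foo2']
```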

woodsaj commented on May 18, 2024

yes. i'll update the collectors so they send strings.

torkelo commented on May 18, 2024

comment to mark this as answered in codetree

woodsaj commented on May 18, 2024

this has already been deployed to production. Metric definitions and events are using []string for tags.
