
graphite-clickhouse's People

Contributors

almostinf, blind-oracle, bzed, civil, dependabot[bot], felixoid, hedius, hipska, lexx-bright, lomik, lordvidex, mchrome, mrodikov, msaf1980, presto53, ravismsk, sergeyignatov, tetrergeru, tsukanov-as, wowverylogin


graphite-clickhouse's Issues

missing values for 1s precision under stress test

I am running graphite-clickhouse with the recommended docker-compose and conducting a stress test with a command-line tool.

I publish metrics every second with names like STRESS.host.ip-0.com.graphite.stresser.a.count, and I changed rollup.xml to test its performance, like so:

<yandex>
    <graphite_rollup>
        <pattern>
            <regexp>^STRESS\.</regexp>
            <function>avg</function>
            <retention>
                <age>0</age>
                <precision>1</precision>
            </retention>
            <retention>
                <age>120</age>
                <precision>5</precision>
            </retention>
            <retention>
                <age>600</age>
                <precision>60</precision>
            </retention>
        </pattern>
        <default>
            <function>avg</function>
            <retention>
                <age>0</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>2592000</age>
                <precision>3600</precision>
            </retention>
        </default>
    </graphite_rollup>
</yandex>
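For illustration, the way a rollup pattern like the one above maps a point's age to a precision can be sketched as a simple lookup (a hypothetical Python helper, not the project's actual Go code — `precision_for` and `RETENTIONS` are names invented here):

```python
# Sketch: pick the storage precision for a data point, given the <retention>
# rules of the STRESS pattern above. Age is seconds since the point was written.
RETENTIONS = [(0, 1), (120, 5), (600, 60)]  # (age, precision) pairs, ascending age

def precision_for(age_seconds):
    """Return the precision of the last retention whose age threshold is reached."""
    precision = RETENTIONS[0][1]
    for age, prec in RETENTIONS:
        if age_seconds >= age:
            precision = prec
    return precision

print(precision_for(30))     # -> 1   (younger than 120s: 1s precision)
print(precision_for(300))    # -> 5   (older than 120s: 5s precision)
print(precision_for(86400))  # -> 60  (older than 600s: 60s precision)
```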

(I tried with avg function as well but the problem persisted.)

It is returning inconsistent, gapped points:

watch -n 1 "curl 'localhost:8080/render/?target=STRESS.host.ip-0.com.graphite.stresser.a.count&format=csv&from=-30s'"

(screen recording: Peek 2019-04-03 15-56)

watch -n 1 "curl 'localhost:8080/render/?target=STRESS.host.ip-0.com.graphite.stresser.a.count&format=csv&from=-150s&to=-120'"

(screen recording: Peek 2019-04-03 16-01)

[Question] Using replicated tables and carbonapi

Hi,

I'm trying to set up carbon-clickhouse, ClickHouse and graphite-clickhouse with replicated ClickHouse tables. My question: when using carbonapi, should I configure all servers as endpoints, or only one (e.g. behind a VIP) and switch to another when that one goes down?

I don't know if there is any other place (forum, chat, .. ) to ask questions, so trying here.

"deleted" for index tables

Hi,

the old graphite_tree table had a "Deleted" column to mark paths as deleted, which was very helpful in those cases where badly named series appeared for whatever reason.
Is there a way to make this work with the index table?

Thanks,
Bernd

Retention policy isn't respected

I have a metric called my.something_sum with the following rollup rules setup in graphite-clickhouse and clickhouse-server:

<pattern>
        <regexp>^my\..*_sum(\.|$)</regexp>
        <function>sum</function>
        <retention>
                <age>0</age>
                <precision>60</precision>
        </retention>
        <retention>
                <age>1209600</age>
                <precision>900</precision>
        </retention>
        <retention>
                <age>5184000</age>
                <precision>86400</precision>
        </retention>
 </pattern>

So, this request /render/?target=my.something_sum&from=-64d&format=csv can return either a list of values with resolution = 60sec or resolution = 86400sec, but I expect the resolution to be 86400sec.

The problem, as I see it, lies here: https://github.com/lomik/graphite-clickhouse/blob/138d67073ca210eceb7f5c8efa31814c220d0877/helper/rollup/rollup.go#L215

Here the first point's timestamp is analyzed. When the event represented by this metric is rare, the first non-null value (only non-null values are stored in ClickHouse) can have an age of less than 5184000 seconds, so the coarser precision isn't applied.

Graphite-web always applies the resolution corresponding to the from field of the request, not to the age of the metric's oldest point.

Isn't it a problem?

Tags performance

Requests for tagged metrics are slow when there are many tags in ClickHouse. In our test case we had 3 tags with 10K values each, and metrics were generated over a period of one year. A seriesByTag request took about 1 second.

I found that graphite-clickhouse makes 2 queries to ClickHouse when processing seriesByTag requests:

  1. Query to get Paths by tags (from graphite_tagged table).

  2. Query to get metrics by these Paths.

The first query is pretty slow (0.6 s in our test). Can we skip it and make the second query directly, using
WHERE Path LIKE '%tag1_name=tag1_value%' AND Path LIKE '%tag2_name=tag2_value%'...
instead of
WHERE Path IN (<results of the first query>)?
It is much faster, takes just 0.04 s on the same test data.

Maybe there are cases where the original queries are faster, so it would be better to have a config option to enable the proposed change.

Are there any issues with this? If not, I can implement it and open a PR.
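The proposed rewrite can be sketched as a tiny clause builder (hypothetical Python sketch of the idea, not the project's code; `like_conditions` is a name invented here — it assumes tags are stored inside Path as name=value pairs):

```python
# Sketch: build per-tag LIKE clauses instead of "Path IN (<subquery>)".
def like_conditions(tags):
    """tags: list of (name, value) pairs; returns the WHERE fragment."""
    return " AND ".join(
        "Path LIKE '%{}={}%'".format(name, value) for name, value in tags
    )

print(like_conditions([("tag1_name", "tag1_value"), ("tag2_name", "tag2_value")]))
# -> Path LIKE '%tag1_name=tag1_value%' AND Path LIKE '%tag2_name=tag2_value%'
```

Note this sketch ignores escaping of `%` and `_` inside tag values, which a real implementation would have to handle.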

Don't list old metric names?

Is there a way to filter the metric name list so that I can exclude metrics that have not been inserted in the past few days? I can see in the RU wiki page that there are details about the old index_tree table, where you could set Deleted = 1, but I can't seem to find any documentation about how to do this currently.

bufio.Scanner: token too long

I use graphite-clickhouse built from current master.
config:

[common]
listen = ":9090"
max-cpu = 8
memory-return-interval = "0s"
max-metrics-in-find-answer = 0

[clickhouse]
url = "http://127.0.0.1:8123/"
extra-prefix = ""
data-table = "graphite.points_cluster"
data-timeout = "1m0s"
rollup-conf = "/etc/graphite-clickhouse/graphite_rollup.xml"
index-table = "graphite.index_cluster"
index-use-daily = true
index-timeout = "1m"
tagged-table = "graphite.tagged_cluster"

[prometheus]
external-url = "https://grafana.example.com/prometheus"
page-title = "Prometheus Time Series Collection and Processing Server"

[[data-table]]
table = "graphite.points_cluster"
reverse = true
rollup-conf = "/etc/graphite-clickhouse/graphite_rollup.xml"

[[logging]]
logger = ""
file = "stdout"
level = "info"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"

graphite_rollup:

       <pattern>
            <regexp>^.*scrape_interval=10s.*$</regexp>
            <function>avg</function>
            <retention>
                <age>0</age>
                <precision>10</precision>
            </retention>
            <retention>
                <age>259200</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>2592000</age>
                <precision>600</precision>
            </retention>
        </pattern>

Collect 10s metric with telegraf and send to carbon-clickhouse.
If I request a metric over a period longer than 7 days, I get an error:

Object
xhrStatus:"complete"
request:Object
method:"GET"
url:"api/datasources/proxy/18/api/v1/query_range?query=mem_used_percent%7Bhost%3D%22metric%22%7D&start=1568832000&end=1569523200&step=600&timeout=10s"
response:Object
status:"error"
errorType:"execution"
error:"bufio.Scanner: token too long"
message:"bufio.Scanner: token too long"

in log graphite-clickhouse:

[2019-09-26T21:42:56.285+0300] INFO [query] query {"query": "SELECT splitByChar('=', Tag1)[2] as value FROM graphite.tagged_cluster WHERE (Tag1 LIKE 'host=%') AND (Date >= '2019-09-19') GROUP BY value ORDER BY value", "request_id": "81fe3ec423d58c03e75dccd575f9c19a", "time": 0.00851583}
[2019-09-26T21:42:56.285+0300] INFO access {"request_id": "81fe3ec423d58c03e75dccd575f9c19a", "grafana": "Org:1; Dashboard:; Panel:", "time": 0.008859073, "method": "GET", "url": "/api/v1/label/host/values", "peer": "[::1]:6660", "status": 200}
[2019-09-26T21:42:56.351+0300] INFO [query] query {"query": "SELECT Path FROM graphite.tagged_cluster  WHERE ((Tag1='__name__=mem_used_percent') AND (arrayExists((x) -> x='host=metric', Tags))) AND (Date >='2019-09-18' AND Date <= '2019-09-26') GROUP BY Path", "request_id": "5d67a79bb41a4e85e2d6c5eea1038326", "time": 0.011381311}
[2019-09-26T21:42:56.394+0300] INFO access {"request_id": "5d67a79bb41a4e85e2d6c5eea1038326", "grafana": "Org:1; Dashboard:400; Panel:3", "time": 0.054455356, "method": "GET", "url": "/api/v1/query_range?query=mem_used_percent%7Bhost%3D%22metric%22%7D&start=1568832000&end=1569523200&step=600&timeout=10s", "peer": "[::1]:6660", "status": 422}

The request is for only one metric. If I request data for a shorter range, for example 1 day to 10 days ago, the data is displayed.

backtraces in graphite-web

Looking at the format=json output for a recently added metric, I see:

[{"target": "collectd.........", "datapoints": [[null, 1492522390], [null, 1492522400], [null, 1492522410], [null, 1492522420], [null, 1492522430], [null, 1492522440], [null, 1492522450], [null, 1492522460], [null, 1492522470], [null, 1492522480], [null, 1492522490], [null, 1492522500], [null, 1492522510], [null, 1492522520], [null, 1492522530], [null, 1492522540], [null, 1492522550], [null, 1492522560], [null, 1492522570], [null, 1492522580], [null, 1492522590], [null, 1492522600], [null, 1492522610], [null, 1492522620], [null, 1492522630], [null, 1492522640], [null, 1492522650], [null, 1492522660],
........,
[NaN, 1492606510], [0.0, 1492606520], [0.0, 1492606530], [0.0, 1492606540], [0.0, 1492606550], [0.0, 1492606560], [0.0, 1492606570], [0.0, 1492606580], [0.0, 1492606590], [0.0, 1492606600], [0.0, 1492606610], [0.0, 1492606620], [0.0, 1492606630], [0.0, 1492606640], [0.0, 1492606650], [0.0, 1492606660], [0.0, 1492606670], [0.0, 1492606680], [0.0, 1492606690], [0.0, 1492606700], [0.0, 1492606710], [0.0, 1492606720], [0.0, 1492606730], [0.0, 1492606740], [0.0, 1492606750], [0.0, 1492606760], [0.0, 1492606770], [0.0, 1492606780], [0.0, 1492606790], [0.0, 1492606800], [0.0, 1492606810], [0.0, 1492606820], [0.0, 1492606830], [0.0, 1492606840], [0.0, 1492606850], [0.0, 1492606860], [0.0, 1492606870], [0.0, 1492606880], [0.0, 1492606890], [0.0, 1492606900], [0.0, 1492606910], [0.0, 1492606920], [0.0, 1492606930], [0.0, 1492606940], [0.0, 1492606950], [0.0, 1492606960], [0.0, 1492606970], [0.0, 1492606980], [0.0, 1492606990], [0.0, 1492607000], [0.0, 1492607010], [0.0, 1492607020], [0.0, 1492607030], [0.0, 1492607040], [0.0, 1492607050], [0.0, 1492607060], [0.0, 1492607070], [0.0, 1492607080], [0.0, 1492607090], [0.0, 1492607100], [0.0, 1492607110], [0.0, 1492607120], [0.0, 1492607130], [0.0, 1492607140], [0.0, 1492607150], [0.0, 1492607160], [0.0, 1492607170], [0.0, 1492607180], [0.0, 1492607190], [0.0, 1492607200], [0.0, 1492607210], [0.0, 1492607220], [0.0, 1492607230], [0.0, 1492607240], [0.0, 1492607250], [0.0, 1492607260], [0.0, 1492607270], [0.0, 1492607280], [0.0, 1492607290], [0.0, 1492607300], [0.0, 1492607310], [0.0, 1492607320], [0.0, 1492607330], [0.0, 1492607340], [0.0, 1492607350], [0.0, 1492607360], [0.0, 1492607370], [0.0, 1492607380], [0.0, 1492607390], [0.0, 1492607400], [0.0, 1492607410], [0.0, 1492607420], [0.0, 1492607430], [0.0, 1492607440], [0.0, 1492607450], [0.0, 1492607460], [0.0, 1492607470], [0.0, 1492607480], [0.0, 1492607490], [0.0, 1492607500], [0.0, 1492607510], [0.0, 1492607520], [0.0, 1492607530], [0.0, 1492607540], [0.0, 1492607550], 
[0.0, 1492607560], [0.0, 1492607570], [0.0, 1492607580], [0.0, 1492607590], [0.0, 1492607600], [0.0, 1492607610], [0.0, 1492607620], [0.0, 1492607630], [0.0, 1492607640], [0.0, 1492607650], [0.0, 1492607660], [0.0, 1492607670], [0.0, 1492607680], [0.0, 1492607690], [0.0, 1492607700], [0.0, 1492607710], [0.0, 1492607720], [0.0, 1492607730], [0.0, 1492607740], [0.0, 1492607750], [0.0, 1492607760], [0.0, 1492607770], [0.0, 1492607780], [0.0, 1492607790], [0.0, 1492607800], [0.0, 1492607810], [0.0, 1492607820], [0.0, 1492607830], [0.0, 1492607840], [0.0, 1492607850], [0.0, 1492607860], [0.0, 1492607870], [0.0, 1492607880], [0.0, 1492607890], [0.0, 1492607900], [0.0, 1492607910], [0.0, 1492607920], [0.0, 1492607930], [0.0, 1492607940], [0.0, 1492607950], [0.0, 1492607960], [0.0, 1492607970], [0.0, 1492607980], [0.0, 1492607990], [0.0, 1492608000], [0.0, 1492608010], [0.0, 1492608020], [0.0, 1492608030], [0.0, 1492608040], [0.0, 1492608050], [0.0, 1492608060], [0.0, 1492608070], [0.0, 1492608080], [0.0, 1492608090], [0.0, 1492608100], [0.0, 1492608110], [0.0, 1492608120], [0.0, 1492608130], [0.0, 1492608140], [0.0, 1492608150], [0.0, 1492608160], [0.0, 1492608170], [0.0, 1492608180], [0.0, 1492608190], [0.0, 1492608200], [0.0, 1492608210], [0.0, 1492608220], [0.0, 1492608230], [0.0, 1492608240], [0.0, 1492608250], [0.0, 1492608260], [0.0, 1492608270], [0.0, 1492608280], [0.0, 1492608290], [0.0, 1492608300], [0.0, 1492608310], [0.0, 1492608320], [0.0, 1492608330], [0.0, 1492608340], [0.0, 1492608350], [0.0, 1492608360], [0.0, 1492608370], [0.0, 1492608380], [0.0, 1492608390], [0.0, 1492608400], [0.0, 1492608410], [0.0, 1492608420], [0.0, 1492608430], [0.0, 1492608440], [0.0, 1492608450], [0.0, 1492608460], [0.0, 1492608470], [0.0, 1492608480], [0.0, 1492608490], [0.0, 1492608500], [0.0, 1492608510], [0.0, 1492608520], [0.0, 1492608530], [0.0, 1492608540], [0.0, 1492608550], [0.0, 1492608560], [0.0, 1492608570], [0.0, 1492608580], [0.0, 1492608590], [0.0, 1492608600], 
[0.0, 1492608610], [0.0, 1492608620], [0.0, 1492608630], [0.0, 1492608640], [0.0, 1492608650], [0.0, 1492608660], [0.0, 1492608670], [0.0, 1492608680], [0.0, 1492608690], [0.0, 1492608700], [0.0, 1492608710], [0.0, 1492608720], [0.0, 1492608730], [0.0, 1492608740], [0.0, 1492608750], [0.0, 1492608760], [0.0, 1492608770], [null, 1492608780]]}]

Trying to look at the format=png image of such a query results in the following backtrace:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 111, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python2.7/dist-packages/graphite/render/views.py", line 215, in renderView
    image = doImageRender(requestOptions['graphClass'], graphOptions)
  File "/usr/lib/python2.7/dist-packages/graphite/render/views.py", line 436, in doImageRender
    img = graphClass(**graphOptions)
  File "/usr/lib/python2.7/dist-packages/graphite/render/glyph.py", line 196, in __init__
    self.drawGraph(**params)
  File "/usr/lib/python2.7/dist-packages/graphite/render/glyph.py", line 678, in drawGraph
    self.setupYAxis()
  File "/usr/lib/python2.7/dist-packages/graphite/render/glyph.py", line 1137, in setupYAxis
    self.yLabelWidth = max([self.getExtents(label)['width'] for label in self.yLabels])
ValueError: max() arg is an empty sequence

Right now I have no idea where this issue comes from; it might be the NaN value. With the normal Graphite Whisper backend the issue does not happen at all.

Please implement target-whitelist

Hi,

it would be nice to have the option to blacklist all metrics and whitelist a few chosen ones.
I might come up with a PR at some point, but I have the hope that you are faster ;)

Thanks,
Bernd

Debugging/improving poor performance?

I'm currently using carbonapi & go-carbon but would like to switch to clickhouse for ingest performance. The following query takes <1s to run over the past 3h of data (each * expands to roughly 500 metrics):

aliasByNode(highestAverage(group( 
scale(divideSeriesLists($prefix.$group.$server.cloudlinux.*.counter.CPU,$prefix.$group.$server.cloudlinux.*.gauge.lCPU), 0.00001),         
divideSeriesLists($prefix.$group.$server.cloudlinux.*.counter.IOPS,$prefix.$group.$server.cloudlinux.*.gauge.lIOPS),         
divideSeriesLists($prefix.$group.$server.cloudlinux.*.counter.IO,$prefix.$group.$server.cloudlinux.*.gauge.lIO),         
divideSeriesLists($prefix.$group.$server.cloudlinux.*.gauge.NPROC,$prefix.$group.$server.cloudlinux.*.gauge.lNPROC),         
divideSeriesLists($prefix.$group.$server.cloudlinux.*.gauge.EP,$prefix.$group.$server.cloudlinux.*.gauge.lEP),         
divideSeriesLists($prefix.$group.$server.cloudlinux.*.gauge.MEMPHY,$prefix.$group.$server.cloudlinux.*.gauge.lMEMPHY)
)  , 10),4,6)

When I switch carbonapi to point at graphite-clickhouse, the same query takes ~27 s, i.e. roughly 30-60× slower. How can I debug this and improve the performance?

[Question] Metrics sorting and parallel parse

Hello!

I'm running some benchmarks between graphite-clickhouse and go-carbon.
One of the test cases is a query that returns around 3k metrics with 114 million data points, for which go-carbon is considerably faster. I can share more details about the setup and the test if that helps.

I had a look at the code and was wondering why the sorting is implemented in graphite-clickhouse instead of relying on ORDER BY (Path, Time) in ClickHouse.

[2019-08-06T07:59:31.195Z] INFO render {"request_id": "c1c74d80768cc1363a839879dad61a36", "read_bytes": 14907614050, "read_points": 113399566}

[2019-08-06T07:59:31.196Z] DEBUG parse {"request_id": "c1c74d80768cc1363a839879dad61a36", "runtime": "1m11.01137844s", "runtime_ns": 71.01137844}

[2019-08-06T08:00:08.777Z] DEBUG sort {"request_id": "c1c74d80768cc1363a839879dad61a36", "runtime": "37.581330127s", "runtime_ns": 37.581330127}

[2019-08-06T08:00:22.976Z] DEBUG reply {"request_id": "c1c74d80768cc1363a839879dad61a36", "runtime": "12.760293752s", "runtime_ns": 12.760293752}

Have you also considered parallelizing the parse?

I'm basically looking into improving overall run time for heavy queries.

Thanks!

"runtime error: invalid memory address or nil pointer dereference" on tags/autoComplete/tags request, graphite-clickhouse v 0.11.6

Hi.

I have installed the new version 0.11.6 and now have a problem with autoComplete/tags.

When I try to request all tags in Grafana (a dropdown in the interface), no tags are returned. It calls tags/autoComplete/tags.

I see this in the carbonapi logs:
ERROR zipper error fetching result {"type": "protoV2Group", "name": "clickhouse-cluster", "type": "tagName", "function": "HttpQuery.doRequest", "server": "http://graphite-clickhouse:9090", "name": "clickhouse-cluster", "uri": "http://graphite-clickhouse:9090/tags/autoComplete/tags", "error": "Get http://graphite-clickhouse:9090/tags/autoComplete/tags: EOF"}

And the following message in the graphite-clickhouse log:

[2019-07-15T05:09:15.687Z] INFO access {"request_id": "744edfcb576e13af8d578e289f0fc50c", "grafana": "Org:6; Dashboard:; Panel:", "time": 0.006001445, "method": "GET", "url": "/metrics/find/?format=protobuf&query=%2A", "peer": "10.0.0.29:40438", "status": 200}
2019/07/15 05:09:15 http: panic serving 10.0.0.29:40438: runtime error: invalid memory address or nil pointer dereference
goroutine 1003 [running]:
net/http.(*conn).serve.func1(0xc0003de3c0)
	/usr/local/go/src/net/http/server.go:1769 +0x139
panic(0x16bbd60, 0x2cac7c0)
	/usr/local/go/src/runtime/panic.go:522 +0x1b5
github.com/lomik/graphite-clickhouse/pkg/where.(*Where).And(0x0, 0xc0007bc400, 0x14)
	/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/pkg/where/where.go:126 +0x37
github.com/lomik/graphite-clickhouse/pkg/where.(*Where).Andf(0x0, 0x18aa925, 0xc, 0xc000a63870, 0x1, 0x1)
	/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/pkg/where/where.go:145 +0x75
github.com/lomik/graphite-clickhouse/autocomplete.(*Handler).ServeTags(0xc0007bb990, 0x1d4fce0, 0xc00050a820, 0xc000742100)
	/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/autocomplete/autocomplete.go:118 +0x267
github.com/lomik/graphite-clickhouse/autocomplete.(*Handler).ServeHTTP(0xc0007bb990, 0x1d4fce0, 0xc00050a820, 0xc000742100)
	/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/autocomplete/autocomplete.go:45 +0x80
main.Handler.func1(0x1d50020, 0xc0004a41c0, 0xc000742000)
	/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/graphite-clickhouse.go:65 +0xe4
net/http.HandlerFunc.ServeHTTP(0xc00000f3a0, 0x1d50020, 0xc0004a41c0, 0xc000742000)
	/usr/local/go/src/net/http/server.go:1995 +0x44
net/http.(*ServeMux).ServeHTTP(0x2cd1720, 0x1d50020, 0xc0004a41c0, 0xc000742000)
	/usr/local/go/src/net/http/server.go:2375 +0x1d6
net/http.serverHandler.ServeHTTP(0xc00047d520, 0x1d50020, 0xc0004a41c0, 0xc000742000)
	/usr/local/go/src/net/http/server.go:2774 +0xa8
net/http.(*conn).serve(0xc0003de3c0, 0x1d5b1a0, 0xc000702700)
	/usr/local/go/src/net/http/server.go:1878 +0x851
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:2884 +0x2f4

If the request specifies concrete tags, everything works correctly, e.g. seriesByTag('service_name=tech_consul_node', 'host=tech-node3', 'cpu=cpu-total').

rollup-conf and reverse Path

Hello,

I'm using the reversed Path for the data table. It works well; however, the rollup conf in graphite-clickhouse does not seem to apply its regexps to the reversed Path. So I have to maintain two rollup.xml files: one for graphite-clickhouse that applies to the non-reversed Path, and another for ClickHouse that works on the reversed Path. For example:
For graphite-clickhouse:

        <pattern>
                <regexp>^carbon\.</regexp>
                <function>avg</function>
                <retention>
                        <age>0</age>
                        <precision>60</precision>
                </retention>
                <retention>
                        <age>1296000</age>
                        <precision>600</precision>
                </retention>
        </pattern>

And for CH:

        <pattern>
                <regexp>\.carbon$</regexp>
                <function>avg</function>
                <retention>
                        <age>0</age>
                        <precision>60</precision>
                </retention>
                <retention>
                        <age>1296000</age>
                        <precision>600</precision>
                </retention>
        </pattern>

Would it be possible to use the same logic in graphite-clickhouse as in CH?

[IDEA] Use ClickHouse feature `External Data for Query Processing` for data fetching

UPD: This feature could be used to avoid problems with too-long queries.

Besides that, the maximum number of metrics per query should be checked.


Hello. I would like to implement a feature to mitigate the error DB::Exception: Syntax error: failed at position 262125: ..... Max query size exceeded

The idea is to get the current user setting from ClickHouse, then split the query at https://github.com/lomik/graphite-clickhouse/blob/master/render/handler.go#L162 into multiple queries and send them in parallel. The setting could be refreshed in the background once per minute, the same way the rollup config is.

What do you think?
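The splitting step can be sketched as simple batching (a hypothetical Python sketch of the idea, not the project's Go code; `batch_metrics` is a name invented here, and a real implementation would size batches by the query's byte length against max_query_size rather than by item count):

```python
# Sketch: break a long metric list into batches so that each generated
# query stays under the server's size limit; batches can then run in parallel.
def batch_metrics(metrics, max_batch):
    return [metrics[i:i + max_batch] for i in range(0, len(metrics), max_batch)]

batches = batch_metrics(["metric.%d" % i for i in range(10)], 4)
print(len(batches))  # -> 3
print(batches[-1])   # -> ['metric.8', 'metric.9']
```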

Better HTTP error handling

I've got the following concerns about current http error handling:

  1. There is no way for the client to find out what the error is about.
  2. In the case of an unsupported parameter you should return an error (e.g. for format).

Tags table not populating

Although tagged data appears to be ingested correctly, I'm unable to populate the tags table using the command, nor do I get any results when talking directly to the API.

[root@core-clickhouse01 log]# curl http://127.0.0.1:9090/tags/autoComplete/tags?pretty=1&limit=100
[1] 4937
[root@core-clickhouse01 log]# clickhouse response status 404: Code: 46, e.displayText() = DB::Exception: Unknown table function WHERE, e.what() = DB::Exception
[root@core-clickhouse01 log]# graphite-clickhouse -tags
[root@core-clickhouse01 log]# clickhouse-client
ClickHouse client version 1.1.54380.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.54380.

core-clickhouse01 :) select * from graphite_tag;

SELECT *
FROM graphite_tag 

┌───────Date─┬─Level─┬─Tag1─┬─Path─┬─IsLeaf─┬─Tags─┬────Version─┐
│ 2016-11-01 │     0 │      │      │      0 │ []   │ 1525973670 │
└────────────┴───────┴──────┴──────┴────────┴──────┴────────────┘
┌───────Date─┬─Level─┬─Tag1─┬─Path─┬─IsLeaf─┬─Tags─┬────Version─┐
│ 2016-11-01 │     0 │      │      │      0 │ []   │ 1525958477 │
└────────────┴───────┴──────┴──────┴────────┴──────┴────────────┘

2 rows in set. Elapsed: 10.025 sec. 

I'm feeding in tagged data:

SELECT *
FROM graphite 
WHERE Path LIKE 'system.%'
LIMIT 10

┌─Path───────────────────────────────────────────┬─Value─┬───────Time─┬───────Date─┬──Timestamp─┐
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958330 │ 2018-05-10 │ 1525958354 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958350 │ 2018-05-10 │ 1525958369 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958360 │ 2018-05-10 │ 1525958389 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958380 │ 2018-05-10 │ 1525958414 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958410 │ 2018-05-10 │ 1525958444 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958450 │ 2018-05-10 │ 1525958479 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958480 │ 2018-05-10 │ 1525958519 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958530 │ 2018-05-10 │ 1525958564 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958590 │ 2018-05-10 │ 1525958614 │
│ system.core.count?host=core-dddev01&type=gauge │     1 │ 1525958630 │ 2018-05-10 │ 1525958669 │
└────────────────────────────────────────────────┴───────┴────────────┴────────────┴────────────┘

10 rows in set. Elapsed: 0.056 sec. Processed 24.58 thousand rows, 2.09 MB (442.29 thousand rows/s., 37.63 MB/s.) 

graphite-config.conf

[common]
listen = ":9090"
max-cpu = 1

[clickhouse]
url = "http://localhost:8123"
data-table = "graphite"
tree-table = "graphite_tree"
date-tree-table = ""
date-tree-table-version = 0
rollup-conf = "/etc/graphite-clickhouse/rollup.xml"
tag-table = "graphite_tag"
extra-prefix = ""
data-timeout = "1m0s"
tree-timeout = "1m0s"

[carbonlink]
server = ""
threads-per-request = 10
connect-timeout = "50ms"
query-timeout = "50ms"
total-timeout = "500ms"

[[logging]]
logger = ""
file = "/var/log/graphite-clickhouse.log"
level = "debug"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"

carbon-clickhouse.conf

[common]
metric-prefix = "carbon.ck-agents.{host}"
metric-endpoint = "local"
metric-interval = "30s"
max-cpu = 2

[logging]
file = "/var/log/carbon-clickhouse.log"
level = "debug"

[data]
path = "/var/local/carbon-clickhouse/data"
chunk-interval = "1s"
chunk-auto-interval = ""

[upload.graphite]
type = "points"
table = "graphite"
threads = 1
url = "http://localhost:8123/"
timeout = "1m0s"

[upload.graphite_tree]
type = "tree"
table = "graphite_tree"
date = "2016-11-01"
threads = 1
url = "http://localhost:8123/"
timeout = "1m0s"
cache-ttl = "12h0m0s"

[upload.graphite_tagged]
type = "tagged"
table = "graphite_tagged"
threads = 1
url = "http://localhost:8123/"
timeout = "1m0s"
cache-ttl = "12h0m0s"
[udp]
listen = ":2003"
enabled = true
[tcp]
listen = ":2003"
enabled = true

The data in question looks like this:

system.swap.total;host=core-dddev01;type=gauge 0 1525976850
system.swap.free;host=core-dddev01;type=gauge 0 1525976850
system.mem.usable;host=core-dddev01;type=gauge 630.55859375 1525976850
system.mem.pct_usable;host=core-dddev01;type=gauge 0.6871520032692537 1525976850
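For reference, the mapping between the ingested `name;tag=value` form above and the `name?tag=value&...` form seen in the stored Path column can be sketched like this (an assumption inferred from the rows shown earlier, not confirmed carbon-clickhouse behaviour; `tagged_to_path` is a name invented here):

```python
# Sketch: convert a tagged Graphite line's name part into the Path form
# observed in the graphite table ("system.core.count?host=...&type=gauge").
def tagged_to_path(name_with_tags):
    name, *tags = name_with_tags.split(";")
    return name + "?" + "&".join(sorted(tags)) if tags else name

print(tagged_to_path("system.swap.total;host=core-dddev01;type=gauge"))
# -> system.swap.total?host=core-dddev01&type=gauge
```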

I think I have all my bases covered, so I must be missing something.

No data from default rollup policy

Hi,

We are using carbon-clickhouse as remote storage for prometheus, with tags (graphite + graphite_tagged tables). Data is sent every 15s.

graphite-clickhouse does not return data for points less than 1 hour old with the following rollup config:

    <graphite_rollup>
        <pattern>
            <regexp>nginx_http_stats_requests</regexp>
            <function>sum</function>
            <retention>
                <age>3600</age>
                <precision>60</precision>
            </retention>
        </pattern>
        <default>
            <function>any</function>
            <retention>
                <age>0</age>
                <precision>15</precision>
            </retention>
        </default>
    </graphite_rollup>

grafana request to graphite-web:

aliasByTags(groupByTags(perSecond(seriesByTag('name=nginx_http_stats_requests_count', 'region=~${region:regex}', 'job=~$role', 'http_host=~$http_host','http_code=~$http_code')), 'sum', 'hostname','http_code'), 'hostname', 'http_code')

When requesting data for the last 3 hours, it returns as expected: the graph is populated up to the last minute.
When requesting the last 30 minutes, all datapoints are null.
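One plausible reading of the symptom: the pattern's first <retention> starts at age=3600, so points younger than one hour are covered by no retention rule of that pattern (sketched below as a hypothetical Python helper, not the actual rollup code):

```python
# Sketch: with the nginx_http_stats_requests pattern above, ages under
# 3600s match no retention entry, consistent with the empty 30-minute window.
RETENTIONS = [(3600, 60)]  # only one (age, precision) rule, starting at 1 hour

def precision_for(age):
    chosen = None  # None means no rule covers this age
    for a, p in RETENTIONS:
        if age >= a:
            chosen = p
    return chosen

print(precision_for(1800))  # 30 minutes old -> None
print(precision_for(7200))  # 2 hours old -> 60
```

Adding an age=0 retention to the pattern (as the <default> block has) would close the gap, if this reading is correct.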

404 when returning empty metric set

Hi,
I trust you and your loved ones are well and safe.

I'm probably misunderstanding things here so your patience is appreciated.

It appears that when an empty data set is returned, it comes with an HTTP 404 status.
I'm honestly not sure whether this is intentional, but the behaviour appears to trigger the retry functionality in carbonapi; it could be that it should not...

So my question is, should graphite-clickhouse return a 404 when correctly returning an empty data set?

Thanks.

Bug with regex alternate

Query like this

seriesByTag('type_instance=~nonpaged|active|used|wired')

is incorrectly translated into

Tag1 LIKE 'type\\\\_instance=nonpaged%' AND match(Tag1, 'type_instance=nonpaged|active|used|wired')

and simply won't work as expected, because only the first value of the alternation ends up in the LIKE, effectively blocking every other alternative. The source of the problem seems to lie in https://github.com/lomik/graphite-clickhouse/blob/master/pkg/where/where.go#L31. It should be something like:

Tag1 LIKE 'type\\\\_instance=%' AND match(Tag1, 'type_instance=nonpaged|active|used|wired')

Even then the regex is incorrect: it matches type_instance=nonpaged, or a bare active, used, or wired anywhere in the value. Wrapping the regex value in (?: and ) produces sane results.

As a temporary workaround I'm wrapping the right side of the match expression in (?: and ) myself, and everything works as expected. For example:

seriesByTag('plugin=memory', 'type_instance=~(?:nonpaged|active|used|wired)')
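The alternation-precedence bug is easy to demonstrate outside the project (Python's re engine uses the same precedence rules for `|`):

```python
import re

# Without a group, '|' splits the whole pattern: 'type_instance=' binds only
# to the first branch. The non-capturing group applies it to every branch.
broken  = r"type_instance=nonpaged|active|used|wired"
wrapped = r"type_instance=(?:nonpaged|active|used|wired)"

print(bool(re.match(broken, "type_instance=wired")))    # -> False
print(bool(re.match(wrapped, "type_instance=wired")))   # -> True
print(bool(re.search(broken, "type_instance=active")))  # -> True, but it
# matched the bare substring 'active', not 'type_instance=active'
```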

[Feature Request] Clickhouse Native TCP Interface support.

With a big dataset the HTTP interface performs considerably slower than the native TCP interface.
ClickHouse version: 20.3.12.112-stable

export QUERY="SELECT Path, Time, Value, Timestamp 
FROM graphite_points 
PREWHERE Date >='2020-06-25' AND Date <= '2020-07-02'
WHERE 
Time >= 1593063300 
AND Time <= 1593668401
AND Path IN (SELECT Path FROM graphite_tagged  WHERE Tag1='__name__=notifications_latency_ms_bucket' AND Date >='2020-06-25' AND Date <= '2020-07-02' GROUP BY Path) 
FORMAT RowBinary;"

$ time clickhouse-client -h 127.0.0.1 --port 9000 <<< $QUERY | wc -c
5596887430

real    0m8.670s
user    0m3.551s
sys     0m4.532s
$ time curl -s --data-binary @- 'http://127.0.0.1:8123/' <<< $QUERY | wc -c
5596887430

real    0m22.171s
user    0m6.714s
sys     0m15.519s

How to configure logging

Hi,
we use graphite-clickhouse and carbon-clickhouse in production and are very happy with the performance improvements over our previous Graphite stack.
The query logging is too verbose for us, but we would like to keep the access logging.
So, ideally, we would like to log query at warn and everything else at info.
We have tried the following configuration without success, but feel that we are misunderstanding how this should work:

[[logging]]
logger = "query"
file = "/var/log/graphite-clickhouse/graphite-clickhouse.log"
level = "warn"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"

[[logging]]
logger = ""
file = "/var/log/graphite-clickhouse/graphite-clickhouse.log"
level = "info"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"

We reversed the ordering and experimented with it in general, but the result is the same: when we set any logger to warn we get no logs at all. We don't see many warnings or errors, so perhaps we were not patient enough for those, but we would have expected to see many access log lines.

Cheers!

  • the friendly telemetry team at UK HMRC (tax people.)

min-age for data-table does not work

What could be wrong in this configuration?
Queries never reach the archive table.

[common]
listen = ":9090"
max-cpu = 8
# Daemon returns empty response if query matches any of regular expressions
# target-blacklist = ["^not_found.*"]

[clickhouse]
# You can add user/password (http://user:password@localhost:8123) and any clickhouse options (GET-parameters) to url
# It is recommended to create read-only user
url = "http://localhost:8123"
data-table = "graphite.points_cluster"
tree-table = "graphite.tree_cluster"
# Optional table with daily series list.
# Useful for installations with big count of short-lived series
date-tree-table = "graphite.series_daily_cluster"
#date-tree-table = ""
# Supported several schemas of date-tree-table:
# 1 (default): table only with Path, Date, Level fields. Described here: https://habrahabr.ru/company/avito/blog/343928/
# 2: table with Path, Date, Level, Deleted, Version fields. Table type "series" in the carbon-clickhouse
date-tree-table-version = 2
rollup-conf = "/etc/graphite-clickhouse/rollup.xml"
# `tagged` table from carbon-clickhouse. Required for seriesByTag
tagged-table = ""
# Add extra prefix (directory in graphite) for all metrics
extra-prefix = ""
data-timeout = "30s"
tree-timeout = "30s"

[carbonlink]
server = ""
threads-per-request = 10
connect-timeout = "50ms"
query-timeout = "50ms"
total-timeout = "500ms"

[[data-table]]
table = "graphite.points_cluster"
max-age = "240h"
reverse = true
rollup-conf = "/etc/graphite-clickhouse/rollup.xml"

[[data-table]]
table = "graphite.points_archive_cluster"
min-age = "240h"
reverse = true
rollup-conf = "/etc/graphite-clickhouse/rollup_archive.xml"

[[logging]]
logger = ""
file = "/var/log/graphite-clickhouse/graphite-clickhouse.log"
level = "info"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"

Error with Tag Setup

I'm trying to expose tags, and I'm getting the following errors using v0.6.3:
# graphite-clickhouse -tags
2018/05/09 06:52:44 clickhouse response status 500: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 14 (line 1, col 14): (Date,Version,Level,Path,IsLeaf,Tags,Tag1) FORMAT RowBinary �B ��Z. Expected one of: TABLE, identifier, FUNCTION, e.what() = DB::Exception
And in the log file:

[2018-05-09T07:36:42.694-0500] INFO [tagger] parse rules {}
[2018-05-09T07:36:42.695-0500] INFO [tagger] parse rules {"time": 0.000354584, "mem_rss_mb": 2}
[2018-05-09T07:36:42.695-0500] INFO [tagger] read and parse tree {}
[2018-05-09T07:36:42.705-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 0 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.010675804}
[2018-05-09T07:36:42.710-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 1 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.004206918}
[2018-05-09T07:36:42.714-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 2 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.003856585}
[2018-05-09T07:36:42.719-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 3 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.003746844}
[2018-05-09T07:36:42.724-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 4 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.004499933}
[2018-05-09T07:36:42.730-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 5 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.005991626}
[2018-05-09T07:36:42.735-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 6 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.004849088}
[2018-05-09T07:36:42.741-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 7 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.00503664}
[2018-05-09T07:36:42.745-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 8 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.004059725}
[2018-05-09T07:36:42.750-0500] INFO [query] query {"query": "SELECT Path FROM graphite_tree WHERE cityHash64(Path) % 10 == 9 GROUP BY Path HAVING argMax(Deleted, Version)==0 FORMAT RowBinary", "request_id": "", "time": 0.00467767}
[2018-05-09T07:36:42.753-0500] INFO [tagger] read and parse tree {"time": 0.057897203, "mem_rss_mb": 4}
[2018-05-09T07:36:42.753-0500] INFO [tagger] sort {}
[2018-05-09T07:36:42.759-0500] INFO [tagger] sort {"time": 0.00616382, "mem_rss_mb": 4}
[2018-05-09T07:36:42.759-0500] INFO [tagger] make map {}
[2018-05-09T07:36:42.760-0500] INFO [tagger] make map {"time": 0.000638004, "mem_rss_mb": 4}
[2018-05-09T07:36:42.760-0500] INFO [tagger] match {}
[2018-05-09T07:36:42.764-0500] INFO [tagger] match {"time": 0.004148708, "mem_rss_mb": 4}
[2018-05-09T07:36:42.764-0500] INFO [tagger] copy tags from childs to parents {}
[2018-05-09T07:36:42.764-0500] INFO [tagger] copy tags from childs to parents {"time": 0.000559527, "mem_rss_mb": 4}
[2018-05-09T07:36:42.764-0500] INFO [tagger] marshal RowBinary + gzip {}
[2018-05-09T07:36:42.765-0500] INFO [tagger] marshal RowBinary + gzip {"time": 0.000569245, "mem_rss_mb": 7}
[2018-05-09T07:36:42.765-0500] INFO [tagger] upload to clickhouse {}
[2018-05-09T07:36:42.794-0500] ERROR [query] query {"query": "INSERT INTO  (Date,Version,Level,Path,IsLeaf,Tags,Tag1) FORMAT RowBinary", "request_id": "", "time": 0.028687229, "error": "clickhouse response status 500: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 14 (line 1, col 14): (Date,Version,Level,Path,IsLeaf,Tags,Tag1) FORMAT RowBinary\n\ufffdBZ\ufffd\ufffdZ\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000. Expected one of: TABLE, identifier, FUNCTION, e.what() = DB::Exception\n"}

Clickhouse version is 1.1.54380.

Metrics ClickHouse to Graphite

Hello,

I would like to know how to send the metrics from ClickHouse to Graphite to visualize them from Grafana. Could you help me? Thank you!

Release new version

Hello. Here I'm trying to provide the changelog. If it fits the requirements, I would very much appreciate a new release.

Changes since v0.11.7

## Features

- Add memory-return-interval option
- Decrease amount of transferred data with an aggregation of values and timestamps by Path
    - Increase `scanner.Buffer` to fix `bufio.Scanner: token too long`
- Add `noprom` tag to make prometheus dependency optional
- Upload packages to https://packagecloud.io/go-graphite/

## Bugfix

- Fix metric finder content type for `protobuf`
- Remove unused escape functions 
- Sort labels with __name__
- Allow using `graphite..inner.data` as rollup table
- Fix broken deb-compression fpm argument

Build new release

Hi,

Would it be possible to build a new release? The latest build is 0.9.0. A changelog would also be nice.

Tag table usage

Please document the usage of the 'tag' table and config files.

Please implement filtering based on http header

Hi,

it would be awesome if graphite-clickhouse could filter series based on an http header (or something similar). What I'm thinking about is basically:

  • pass X-GRAPHITE-CUSTOMER: 12345 in the request
  • have a config file which allows prefixes for the customer:
[12345]
blacklisted-prefixes = ["*"]
whitelisted-prefixes = ["foo", "bar", "carbon"]

With such a configuration graphite-clickhouse should just behave like always, but limit all its output to the metrics matching the whitelisted (or not blacklisted) prefixes.

Thanks for considering,

Bernd
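A minimal sketch of the proposed whitelist check, assuming the hypothetical X-GRAPHITE-CUSTOMER header and the per-customer prefixes from the example above (none of this exists in graphite-clickhouse today):

```go
package main

import (
	"fmt"
	"strings"
)

// customerPrefixes maps a customer id (taken from the proposed
// X-GRAPHITE-CUSTOMER header) to its whitelisted metric prefixes.
var customerPrefixes = map[string][]string{
	"12345": {"foo", "bar", "carbon"},
}

// allowed reports whether a metric may be returned to the given customer.
func allowed(customer, metric string) bool {
	for _, p := range customerPrefixes[customer] {
		if strings.HasPrefix(metric, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(allowed("12345", "carbon.agents.metricsReceived")) // true
	fmt.Println(allowed("12345", "secret.data"))                   // false
}
```

The same check would need to apply both to /metrics/find results and to /render targets to avoid leaking series names.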

How to use tags right

Hello!
When i trying using tags i get :
$ graphite-clickhouse -tags
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x7b9ad2]

goroutine 1 [running]:
github.com/lomik/graphite-clickhouse/tagger.(*Set).Merge(0x0, 0x0, 0x1104955)
/root/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/tagger/set.go:53 +0x22
github.com/lomik/graphite-clickhouse/tagger.Make(0xc420210000, 0x8, 0x1)
/root/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/tagger/tagger.go:225 +0xd23
main.main()
/root/go/src/github.com/lomik/graphite-clickhouse/_vendor/src/github.com/lomik/graphite-clickhouse/graphite-clickhouse.go:149 +0x3b5

Strange behavior for rolled-up data

Hi,
I have set up following rollup:

<graphite_rollup>
       <default>
               <function>avg</function>
               <retention>
                       <age>0</age>
                       <precision>1</precision>
               </retention>
                <retention>
                       <age>86400</age>
                       <precision>60</precision>
               </retention>
               <retention>
                       <age>63072000</age>
                       <precision>86400</precision>
               </retention>
       </default>
</graphite_rollup>

I got the expected results for the first interval (30 min), but after that there are null values between the actual values returned by graphite (see screenshot).
rolled-up data example
Have I misconfigured something?
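For context: GraphiteMergeTree only aggregates data during background merges, so the reader has to apply the rollup rules itself at query time. A rough Go sketch of the read-time avg rule from the config above (assumed behavior, not the project's actual code):

```go
package main

import "fmt"

// rollupAvg applies at read time what GraphiteMergeTree may not yet have
// merged on disk: bucket timestamps into `precision`-second steps and
// average the values inside each bucket (the avg rule from the config).
func rollupAvg(times []uint32, values []float64, precision uint32) map[uint32]float64 {
	sum := map[uint32]float64{}
	cnt := map[uint32]int{}
	for i, t := range times {
		b := t - t%precision // align to the bucket start
		sum[b] += values[i]
		cnt[b]++
	}
	for b := range sum {
		sum[b] /= float64(cnt[b])
	}
	return sum
}

func main() {
	out := rollupAvg([]uint32{0, 30, 60}, []float64{1, 3, 5}, 60)
	fmt.Println(out[0], out[60]) // 2 5
}
```

If the reader instead keeps the raw 1-second grid for data that should already be at 60-second precision, every step without a raw point shows up as null, which matches the screenshot.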

Basic explanation

Could you explain in a few words what graphite-clickhouse does? I am familiar with the basic original Graphite stack:

  • Carbon
  • Whisper
  • Graphite-web (api for Grafana and basic web panel).

As I understand it, in the ClickHouse case graphite-clickhouse sits in front of ClickHouse to provide the same metrics API (as Whisper does) to a Graphite frontend (graphite-web, carbonapi, etc.). Is that correct? Useful links are also appreciated.

Reuse Clickhouse rollup rules

Right now it's not possible because they have an extra element.

Apart from that, the files are identical, so it makes no sense to keep two sets of files: one for ClickHouse and another for graphite-clickhouse.

[IDEA] Use groupArray for SELECT FROM graphite_data

Hello again. I've done some experiments and found that the current SELECT is very inefficient for a big set of data. Here are the results:

>>> time clickhouse-client --optimize_throw_if_noop 1 -d graphite --log_queries 0 -q "SELECT Path, Time, Value, Timestamp FROM graphite.carbon PREWHERE (Date >= '2019-08-29') AND (Date <= '2019-08-29') WHERE (Path IN _data) FORMAT RowBinary" --external --file=metrics --types=String | wc
 188699 1777390 2455102113

real	0m48.303s
user	0m45.364s
sys	0m2.240s
>>> time clickhouse-client --optimize_throw_if_noop 1 -d graphite --log_queries 0 -q "SELECT Path, Time, Value, Timestamp FROM graphite.carbon PREWHERE (Date >= '2019-08-29') AND (Date <= '2019-08-29') WHERE (Path IN _data) FORMAT Null" --external --file=metrics --types=String | wc
      0       0       0

real	0m13.941s
user	0m1.396s
sys	0m0.464s
>>> time clickhouse-client --optimize_throw_if_noop 1 -d graphite --log_queries 0 -q "SELECT Path, Time, Value, Timestamp FROM graphite.carbon PREWHERE (Date >= '2019-08-29') AND (Date <= '2019-08-29') WHERE (Path IN _data)" --external --file=metrics --types=String | wc
24261245 97044980 2683087586

real	0m43.619s
user	0m43.384s
sys	0m2.388s
>>> time clickhouse-client --optimize_throw_if_noop 1 -d graphite --log_queries 0 -q "SELECT Path, groupArray(Time), groupArray(Value), groupArray(Timestamp) FROM graphite.carbon PREWHERE (Date >= '2019-08-29') AND (Date <= '2019-08-29') WHERE (Path IN _data) GROUP BY Path FORMAT RowBinary" --external --file=metrics --types=String | wc
 188739 1779468 387864252

real	0m28.289s
user	0m14.300s
sys	0m0.576s
>>> time clickhouse-client --optimize_throw_if_noop 1 -d graphite --log_queries 0 -q "SELECT Path, groupArray(Time), groupArray(Value), groupArray(Timestamp) FROM graphite.carbon PREWHERE (Date >= '2019-08-29') AND (Date <= '2019-08-29') WHERE (Path IN _data) GROUP BY Path FORMAT Null" --external --file=metrics --types=String | wc
      0       0       0

real	0m14.882s
user	0m0.200s
sys	0m0.072s
>>> time clickhouse-client --optimize_throw_if_noop 1 -d graphite --log_queries 0 -q "SELECT Path, groupArray(Time), groupArray(Value), groupArray(Timestamp) FROM graphite.carbon PREWHERE (Date >= '2019-08-29') AND (Date <= '2019-08-29') WHERE (Path IN _data) GROUP BY Path" --external --file=metrics --types=String | wc
  25792  103168 597917411

real	0m24.955s
user	0m10.348s
sys	0m0.576s

groupArray has a tiny calculation overhead, but it is more than two times faster when transferring the same data. I've included the TSV output to give an idea of the total number of points.

Do you mind if I try to implement groupArray in DataParse?
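To sketch the effect on the client side: with GROUP BY Path + groupArray, each Path string crosses the wire once per series instead of once per point, and a reader could reassemble the rows roughly like this (illustrative Go, not the actual DataParse code):

```go
package main

import "fmt"

// Point is one raw row of the per-point SELECT: Path repeats on every row.
type Point struct {
	Path  string
	Time  uint32
	Value float64
}

// Series holds what groupArray(Time) and groupArray(Value) return per Path.
type Series struct {
	Times  []uint32
	Values []float64
}

// groupByPath mimics GROUP BY Path + groupArray(...) on the client side.
func groupByPath(rows []Point) map[string]Series {
	out := map[string]Series{}
	for _, r := range rows {
		s := out[r.Path]
		s.Times = append(s.Times, r.Time)
		s.Values = append(s.Values, r.Value)
		out[r.Path] = s
	}
	return out
}

func main() {
	g := groupByPath([]Point{{"a.b", 100, 1}, {"a.b", 110, 2}, {"x.y", 100, 3}})
	fmt.Println(len(g), g["a.b"].Times) // 2 [100 110]
}
```

The byte counts in the benchmark (2.45 GB vs 0.39 GB of RowBinary) line up with this: the saving is almost entirely the de-duplicated Path strings.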

target-blacklist still exposes blacklisted prefixes

With the following blacklist:

target-blacklist = ["^carbon.*","^clickhouse.*"]

the metrics finder still shows that the blacklisted prefixes exist:

$ curl http://10.47.127.37:8081/metrics/find/?query='*'
[{"allowChildren":1,"expandable":1,"leaf":0,"id":"clickhouse","text":"clickhouse","context":{}},{"allowChildren":1,"expandable":1,"leaf":0,"id":"carbon","text":"carbon","context":{}}]

although it hides their content properly:

$ curl http://10.47.127.37:8081/metrics/find/?query='carbon.*'
[]

I'd expect that '*' also hides all blacklisted series.

Thanks for fixing!

[IDEA] create CLI/web tool to manage metrics

I posted an issue to the carbon-clickhouse repo some time ago asking how to delete specific metrics.

I ended up finding a way using some SQL queries, but I think this project would greatly benefit from another tool to manage the metrics as a whole.

In the traditional Graphite use, you can rename and delete metrics easily by just renaming or deleting folders/files on disk, with the changes being almost instantly visible in any front-end.

With CH as the backend, you'd need to figure out very specific SQL queries to do the same thing.

Would it be possible to create a way to do this? I'm thinking either a standalone tool that talks directly to the CH instance, or maybe a REST API that works via graphite-clickhouse on a new URL (/metrics, or /api/metrics), which would allow for some typical management tasks.

bufio.Scanner

Hi,

as I mentioned in #67 (comment), would it be possible to make the bufio.Scanner buffer size a config option and print a clear log message when the size is too small? I have increased the size again to be able to look back more than 75 days.
@ihard, where have you seen this error message? I haven't seen it and had to find out the hard way.
I have no idea what the right fixed size would be; maybe someone wants to look back several years. So making it a config option would be nice, if possible. I'm really not a developer, so I can't do it myself.
Ralph

current master broken!?

Hi,

right now the current version from master does not work at all for me. Looking at the log it seems that the query in graphite_tree has an extra . at the end which should not be there.


[2017-07-12 12:58:41] I clickhouse.go:71: query {"request_id":"2","query":"SELECT Path FROM graphite_tree WHERE (Level = 5) AND (Path = 'carbon.agents.graphitedev002.pickle.metricsReceived' OR Path = 'carbon.agents.graphitedev002.pickle.metricsReceived.') GROUP BY Path HAVING argMax(Deleted, Version)==0","runtime_ns":6432321,"runtime":"6.432321ms"}
[2017-07-12 12:58:41] I graphite-clickhouse.go:74: access {"request_id":"2","runtime":"6.548465ms","runtime_ns":6548465,"method":"GET","url":"/metrics/find/?local=1&format=pickle&query=carbon.agents.graphitedev002.pickle.metricsReceived&from=1499770721&until=1499857121","peer":"127.0.0.1:48454","status":200}
[2017-07-12 12:58:41] I clickhouse.go:71: query {"request_id":"6","query":"SELECT Path FROM graphite_tree WHERE (Level = 5) AND (Path = 'carbon.agents.graphitedev002.pickle.metricsReceived' OR Path = 'carbon.agents.graphitedev002.pickle.metricsReceived.') GROUP BY Path HAVING argMax(Deleted, Version)==0","runtime_ns":4658751,"runtime":"4.658751ms"}
[2017-07-12 12:58:41] I clickhouse.go:71: query {"request_id":"6","query":" SELECT Path, Time, Value, Timestamp FROM graphite WHERE (Path IN ('carbon.agents.graphitedev002.pickle.metricsReceived')) AND ((Date >='2017-07-11' AND Date <= '2017-07-12' AND Time >= 1499770721 AND Time <= 1499857139)) FORMAT RowBinary ","runtime_ns":11681202,"runtime":"11.681202ms"}
[2017-07-12 12:58:41] I graphite-clickhouse.go:74: access {"request_id":"6","runtime":"19.191811ms","runtime_ns":19191811,"method":"GET","url":"/render/?format=pickle&local=1&noCache=1&from=1499770721&until=1499857121&target=carbon.agents.graphitedev002.pickle.metricsReceived&now=1499857121","peer":"127.0.0.1:48454","status":200}

I've bisected it down to

117efa3b9d07125d57c3bc6f13f0d2ab751597bd is the first bad commit
commit 117efa3b9d07125d57c3bc6f13f0d2ab751597bd
Author: Roman Lomonosov <[email protected]>
Date:   Mon May 15 21:32:19 2017 +0300

    update zap

:100644 100644 f9fc933eb6a2a06169461103f3c1b53ceb980699 7bd0e607ddec34a4c97039b72eb8d2b0e069fbda M	.gitmodules
:040000 040000 54d57556519c0c07f3aa0ca107929341d9e6e375 37e986474ae52d2c816b5640b7a7a4f24200bd47 M	config
:100644 100644 3e5506c35e38055699a0f7ba69931caebcae90af cd4d14d048048e943ba3e6e3b469999349e07a0c M	graphite-clickhouse.go
:040000 040000 e55d6aed02ce35347137a5c8fec2321c56cfed98 2196e146f095e058191e10134267c9913d188218 M	helper
:040000 040000 52be63571bafd8ae7b25bcc29433defcc71c7fa5 1f74a7862fcb8534276b1f93aaaeaafd1c513f89 M	render
:040000 040000 3b6a0ce99138ea284590349d0aa33f66d8017ef4 83b6cabd5f099cd38d9f514c69bb3cad52605254 M	tagger
:040000 040000 52ad2c251dd4e077bfb886dd5ecf107af53c75b5 af4377c27303521a02d6d4cc2a4a859bbbee958f M	vendor

Right now I don't have the time to debug this further, but maybe later.

rollup-conf = "auto" is not applied when using Distributed tables

graphite-clickhouse -version
0.11.7

cat /etc/clickhouse-server/config.d/rollup.xml
<yandex>
<graphite_rollup>
<default>
  <function>avg</function>
    <retention>
      <age>0</age>
      <precision>10</precision>
    </retention>
    <retention>
      <age>259200</age>
      <precision>30</precision>
    </retention>
    <retention>
      <age>1209600</age>
      <precision>300</precision>
    </retention>
    <retention>
      <age>2419200</age>
      <precision>900</precision>
    </retention>
    <retention>
      <age>29030400</age>
      <precision>3600</precision>
    </retention>
</default>
</graphite_rollup>
</yandex>

Distributed table

SHOW CREATE TABLE graphite_reverse

┌─statement──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.graphite_reverse (`Path` String, `Value` Float64, `Time` UInt32, `Date` Date, `Timestamp` UInt32) ENGINE = Distributed(shardOne, shardOne, graphite_reverse, sipHash64(Path)) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Data table

SHOW CREATE TABLE shardOne.graphite_reverse

┌─statement───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE shardOne.graphite_reverse (`Path` String, `Value` Float64, `Time` UInt32, `Date` Date, `Timestamp` UInt32) ENGINE = GraphiteMergeTree(Date, (Path, Time), 8192, 'graphite_rollup') │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Retention rules from the database

SELECT *
FROM system.graphite_retentions

┌─config_name─────┬─regexp─┬─function─┬──────age─┬─precision─┬─priority─┬─is_default─┬─Tables.database─┬─Tables.table─────────┐
│ graphite_rollup │        │ avg      │ 29030400 │      3600 │    65535 │          1 │ ['shardOne']    │ ['graphite_reverse'] │
│ graphite_rollup │        │ avg      │  2419200 │       900 │    65535 │          1 │ ['shardOne']    │ ['graphite_reverse'] │
│ graphite_rollup │        │ avg      │  1209600 │       300 │    65535 │          1 │ ['shardOne']    │ ['graphite_reverse'] │
│ graphite_rollup │        │ avg      │   259200 │        30 │    65535 │          1 │ ['shardOne']    │ ['graphite_reverse'] │
│ graphite_rollup │        │ avg      │        0 │        10 │    65535 │          1 │ ['shardOne']    │ ['graphite_reverse'] │
└─────────────────┴────────┴──────────┴──────────┴───────────┴──────────┴────────────┴─────────────────┴──────────────────────┘

The query graphite-clickhouse runs to fetch these rules:
SELECT regexp, function, age, precision, is_default FROM system.graphite_retentions ARRAY JOIN Tables AS table WHERE (table.database = 'default') AND (table.table = 'graphite_reverse') ORDER BY is_default ASC, priority ASC, regexp ASC, age ASC

Config

cat /etc/graphite-clickhouse/graphite-clickhouse.conf
[common]
listen = ":9090"
max-cpu = 8

[clickhouse]
url = "http://localhost:8123/?max_query_size=2097152&readonly=2"
data-table = ""
index-table = "graphite_index"
rollup-conf = "auto"
data-timeout = "1m0s"
index-timeout = "1m0s"
tagged-table = "graphite_tagged"

[[data-table]]
table = "graphite_reverse"
reverse = true
rollup-conf = "auto"

[[logging]]
logger = ""
file = "/var/log/graphite-clickhouse/graphite-clickhouse.log"
level = "info"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"

With rollup-conf = "auto" configured, points were still selected once per minute, i.e. with the default retention settings. This happens because the rules are looked up for the table configured in graphite-clickhouse, but if that table is a Distributed one, it is not mentioned in system.graphite_retentions.
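One possible fix is to resolve a Distributed table to its underlying table before querying system.graphite_retentions. A rough sketch of parsing the engine definition (illustrative only; argument order as in the CREATE TABLE statement above):

```go
package main

import (
	"fmt"
	"regexp"
)

// distRe picks the cluster, database, and table arguments out of a
// Distributed(...) engine definition (quotes optional, sketch only).
var distRe = regexp.MustCompile(`Distributed\(\s*'?([^,')]+)'?\s*,\s*'?([^,')]+)'?\s*,\s*'?([^,')]+)'?`)

// underlyingTable resolves a Distributed table to the real database.table
// that should be looked up in system.graphite_retentions.
func underlyingTable(engineFull string) (db, table string, ok bool) {
	m := distRe.FindStringSubmatch(engineFull)
	if m == nil {
		return "", "", false
	}
	return m[2], m[3], true
}

func main() {
	db, tbl, ok := underlyingTable("Distributed(shardOne, shardOne, graphite_reverse, sipHash64(Path))")
	fmt.Println(ok, db, tbl) // true shardOne graphite_reverse
}
```

With the resolved name, the existing retention query can target `table.database = 'shardOne'` and `table.table = 'graphite_reverse'` instead of the Distributed wrapper.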

Grafana shows only "Missing series in result" from graphite-clickhouse

I have the following docker-compose file:

clickhouse:
  image: yandex/clickhouse-server:19.6.2.11
  volumes:
  - "./rollup.xml:/etc/clickhouse-server/config.d/rollup.xml"
  - "./init.sql:/docker-entrypoint-initdb.d/init.sql"
  - "./data/clickhouse/data:/var/lib/clickhouse/data"
  - "./data/clickhouse/metadata:/var/lib/clickhouse/metadata"
carbon-clickhouse:
  image: lomik/carbon-clickhouse:v0.10.2
  volumes:
  - "./data/carbon-clickhouse:/data/carbon-clickhouse"
  - "./carbon-clickhouse.conf:/etc/carbon-clickhouse/carbon-clickhouse.conf"
  ports:
  - "2003:2003" # plain tcp
  - "2003:2003/udp" # plain udp
  - "2004:2004" # pickle
  - "2006:2006" # prometheus remote write
  links:
  - clickhouse
graphite-clickhouse:
  image: lomik/graphite-clickhouse:v0.11.1
  volumes:
  - "./rollup.xml:/etc/graphite-clickhouse/rollup.xml"
  - "./graphite-clickhouse.conf:/etc/graphite-clickhouse/graphite-clickhouse.conf"
  links:
  - clickhouse
grafana:
  image: grafana/grafana
  restart: always
  ports:
    - 3000:3000
  links:
    - graphite-clickhouse

I get an error message when adding the Graphite data source:
Screenshot 2020-01-02 at 03 01 33

And also when exploring data:
Screenshot 2020-01-02 at 03 02 51

I also put data into Graphite via nc:
echo "local.random.diceroll 4 $(date +%s)" | nc 127.0.0.1 2003

Graphite-web seems to work fine. Is there any way to get the data into Grafana from graphite-clickhouse?

date-tree-table-version gets broken in 0.7

We use the following table for date-tree
graphite.date_metrics ( Path String, Level UInt32, Date Date ) ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/graphite.date_metrics', '{replica}', Date, (Level, Path, Date), 8192) AS SELECT toUInt32(length(splitByChar('.', Path))) AS Level, Date, Path FROM graphite.data;

Starting from graphite-clickhouse 0.7, it queries with the Deleted field even when date-tree-table-version is set to 1.

find returns wrong content type

On large results, /metrics/find sets an incorrect Content-Type header:

Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked

instead of

Content-Type: application/octet-stream

aliasByTags not working correctly

Hi, it seems that this backend does not support tag queries very well when there is also some computation.

The following query displays the series as perSecond(interface_in_octets) and perSecond(interface_out_octets):

aliasByTags(perSecond(seriesByTag('hostname=$host', 'ifName=~${interface:regex}', 'name=~interface_(in|out)_octets')), 'name')

While the following query aliases correctly (but does not show the data I need):

aliasByTags(seriesByTag('hostname=$host', 'ifName=~${interface:regex}', 'name=~interface_(in|out)_octets'), 'name')
