
grok_exporter's Introduction


grok_exporter

Export Prometheus metrics from arbitrary unstructured log data.

About Grok

Grok is a tool to parse crappy unstructured log data into something structured and queryable. Grok is heavily used in Logstash to provide log data as input for ElasticSearch.

Grok ships with about 120 predefined patterns for syslog logs, Apache and other webserver logs, MySQL logs, etc. It is easy to extend Grok with custom patterns.

The grok_exporter aims at porting Grok from the ELK stack to Prometheus monitoring. The goal is to use Grok patterns for extracting Prometheus metrics from arbitrary log files.

How to run the example

Download grok_exporter-$ARCH.zip for your operating system from the releases page, extract the archive, cd grok_exporter-$ARCH, then run

./grok_exporter -config ./example/config.yml

The example log file exim-rejected-RCPT-examples.log contains log messages from the Exim mail server. The configuration in config.yml counts the total number of rejected recipients, partitioned by error message.

The exporter provides the metrics on http://localhost:9144/metrics:

screenshot.png
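For illustration, the counter defined in that configuration would appear in the Prometheus exposition format roughly like this (the metric and label names here are hypothetical stand-ins for the ones in example/config.yml, and the values are made up):

# HELP exim_rejected_rcpt_total Total number of rejected recipients, partitioned by error message.
# TYPE exim_rejected_rcpt_total counter
exim_rejected_rcpt_total{error_message="relay not permitted"} 165
exim_rejected_rcpt_total{error_message="Unrouteable address"} 32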

Configuration

Example configuration:

global:
  config_version: 3
input:
  type: file
  path: ./example/example.log
  readall: true
imports:
- type: grok_patterns
  dir: ./logstash-patterns-core/patterns
metrics:
- type: counter
  name: grok_example_lines_total
  help: Counter metric example with labels.
  match: '%{DATE} %{TIME} %{USER:user} %{NUMBER}'
  labels:
    user: '{{.user}}'
    logfile: '{{base .logfile}}'
server:
  port: 9144

CONFIG.md describes the grok_exporter configuration file and shows how to define Grok patterns, Prometheus metrics, and labels. It also details how to configure file, stdin, and webhook inputs.
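As a quick illustration of the stdin input type, here is a minimal sketch (the metric name and match expression are hypothetical; see CONFIG.md for the authoritative options):

global:
  config_version: 3
input:
  type: stdin
metrics:
- type: counter
  name: example_error_lines_total   # hypothetical metric name
  help: Lines containing the word ERROR.
  match: 'ERROR'                    # plain regex match, no grok patterns needed
server:
  port: 9144

A log producer can then be piped into the exporter, for example tail -f /var/log/app.log | ./grok_exporter -config config.yml (the log path is hypothetical).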

How to build from source

In order to compile grok_exporter from source, you need

  • Go installed and $GOPATH set.
  • gcc installed for cgo. On Ubuntu, use apt-get install build-essential.
  • Header files for the Oniguruma regular expression library, see below.

Installing the Oniguruma library on OS X

brew install oniguruma

Installing the Oniguruma library on Ubuntu Linux

sudo apt-get install libonig-dev

Installing the Oniguruma library from source

curl -sLO https://github.com/kkos/oniguruma/releases/download/v6.9.5_rev1/onig-6.9.5-rev1.tar.gz
tar xfz onig-6.9.5-rev1.tar.gz
cd onig-6.9.5
./configure
make
make install

Installing grok_exporter

git clone https://github.com/fstab/grok_exporter
cd grok_exporter
git submodule update --init --recursive
go install .

The resulting grok_exporter binary will be dynamically linked to the Oniguruma library, i.e. it needs the Oniguruma library to run. The releases are statically linked with Oniguruma, i.e. the releases don't require Oniguruma as a run-time dependency. The releases are built with hack/release.sh.
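To check whether a given grok_exporter binary is dynamically or statically linked against Oniguruma, the standard platform tools can be used (nothing grok_exporter-specific):

ldd grok_exporter       # Linux: lists libonig among the shared libraries if dynamically linked
otool -L grok_exporter  # OS X: lists the shared libraries the binary loads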

Note: Go 1.13 on Mac OS has a bug affecting the file input (golang/go#35767); Go 1.13.5 is still affected. It is recommended to build with Go 1.12 on Mac OS until the bug is fixed.

More Documentation

User documentation is included in the GitHub repository:

  • CONFIG.md: Specification of the config file.
  • BUILTIN.md: Definition of metrics provided out-of-the-box.

Developer notes and links to external documentation are available on the GitHub Wiki pages.

Contact

  • For feature requests, bug reports, etc.: Please open a GitHub issue.
  • For bug fixes, contributions, etc: Create a pull request.
  • Questions? Contact me at [email protected].

Related Projects

Google's mtail goes in a similar direction. It uses its own pattern definition language, so it will not work out-of-the-box with existing Grok patterns. However, mtail's RE2 regular expressions are probably more CPU efficient than Grok's Oniguruma patterns. mtail reads logfiles using the fsnotify library, which might be an obstacle on operating systems other than Linux.

License

Licensed under the Apache License, Version 2.0. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

grok_exporter's People

Contributors

asomers, dcwangmit01, flachesis, fstab, geekdave, gucce, hartfordfive, janisz, jenserat, maxkochubey, rhuss, roidelapluie, skeen, virtuald, whoawhoawhoa


grok_exporter's Issues

Values not updating.

Hi, I'm trying to get grok_exporter to parse a log file created by a node.js app and then send that data to Prometheus. I'm most probably "holding it wrong", because the matched values aren't updated when new lines are written to the log file. When I restart grok_exporter, it sends the last read match to Prometheus and doesn't update with new values.

Example of log lines:

2017-09-14 14:52 +00:00: The proxy currently has 9 miners connected at 5638 h/s with an average diff of 18800

I've got something along these lines in the config file (so far):

global:
    config_version: 2
input:
    type: stdin
grok:
    patterns_dir: ./patterns
metrics:
    - type: gauge
      name: proxy_miners_count
      help: Number of miners connected to proxy.
      match: '%{TIMESTAMP_ISO8601:date} %{ISO8601_TIMEZONE}: The proxy currently has %{NUMBER:miners:int} miners connected at %{NUMBER:hash:int} h/s with an average diff of %{NUMBER:diff:int}'
      value: '{{.miners}}'

    - type: gauge
      name: proxy_hash_count
      help: Total H/s of miners connected to proxy.
      match: '%{TIMESTAMP_ISO8601:date} %{ISO8601_TIMEZONE}: The proxy currently has %{NUMBER:miners:int} miners connected at %{NUMBER:hash:int} h/s with an average diff of %{NUMBER:diff:int}'
      value: '{{.hash}}'

Is it an error on my part, or is this something not intended for grok_exporter? The log file gets updated every 1-2 seconds, so I'd expect to see this data reflected in Prometheus.
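For reference, a minimal file-input sketch for the same metric (the path below is hypothetical; the file input type is described in CONFIG.md):

global:
    config_version: 2
input:
    type: file
    path: /var/log/proxy.log  # hypothetical path to the node.js app's log file
    readall: false            # tail the file and read only new lines
grok:
    patterns_dir: ./patterns
metrics:
    - type: gauge
      name: proxy_miners_count
      help: Number of miners connected to proxy.
      match: '%{TIMESTAMP_ISO8601:date} %{ISO8601_TIMEZONE}: The proxy currently has %{NUMBER:miners:int} miners connected at %{NUMBER:hash:int} h/s with an average diff of %{NUMBER:diff:int}'
      value: '{{.miners}}'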

Cheers and thanks for any help
Pedro

On Windows, grok_exporter metrics are not getting updated in the browser.

I have tried to collect metrics from the Apache logs using grok_exporter in a Windows 10 environment. I am getting the following issues/errors.

  1. "Error reading log lines: failed to watch C:\Apache24\logs\access_log: read error: read C:\Apache24\logs\access_log: The process cannot access the file because another process has locked a portion of the file" after some time.

  2. After putting some load on the Apache web server, http://localhost:9144/metrics still shows the old metrics, not updated metrics (I have tried both true and false for readall).
    I took the exe file from this link.

  3. When I run "tail -f /var/log/access_log" in one terminal and grok_exporter in another terminal, real-time metrics appear in the browser, but after some time it ends with the error mentioned in (1) above.

Reading multiple files ?

Hi,
thanks for grok_exporter, looks really useful! I was wondering if there's support for reading multiple files within the same config/server ?

Does this exporter not tail the file?

Hi,
My case is: I have the nginx access log file. I want the exporter to tail the file. If a new line in the file matches the pattern, then count it. And I want to have an expiration time (e.g. 5 minutes) for the current counter.
It looks like this exporter can do neither?

Bests,
Autumn Wang

Unable to run darwin binary.

Whenever I run the command "./grok_exporter -config ./example/config.yml", I get the following error.

SIGILL: illegal instruction
PC=0x438d4fa m=8 sigcode=1
signal arrived during cgo execution

goroutine 1 [syscall, locked to thread]:
runtime.cgocall(0x4383910, 0xc420051600, 0x4485c19)
/usr/local/Cellar/go/1.9.2/libexec/src/runtime/cgocall.go:132 +0xe4 fp=0xc4200515c0 sp=0xc420051580 pc=0x4003614
github.com/fstab/grok_exporter/exporter._Cfunc_onig_new(0xc4204d45e0, 0x6000000, 0x600076e, 0x0, 0x46ed318, 0x46ed1b8, 0xc4204d6cc0, 0x0)
github.com/fstab/grok_exporter/exporter/_obj/_cgo_gotypes.go:294 +0x4d fp=0xc420051600 sp=0xc4200515c0 pc=0x437bbcd
github.com/fstab/grok_exporter/exporter.(*OnigurumaLib).Compile.func1(0xc4204d45e0, 0x6000000, 0x600076e, 0x0, 0x46ed318, 0x46ed1b8, 0xc4204d6cc0, 0x53)
/Users/fabian/go/src/github.com/fstab/grok_exporter/exporter/oniguruma.go:77 +0x19f fp=0xc420051660 sp=0xc420051600 pc=0x437e1ef
github.com/fstab/grok_exporter/exporter.(*OnigurumaLib).Compile(0xc4204fa0a8, 0xc42017f800, 0x76e, 0x0, 0x0, 0x0)
/Users/fabian/go/src/github.com/fstab/grok_exporter/exporter/oniguruma.go:77 +0x179 fp=0xc4200516f0 sp=0xc420051660 pc=0x437c4a9
github.com/fstab/grok_exporter/exporter.Compile(0xc420168000, 0x6d, 0xc42011a050, 0xc4204fa0a8, 0x0, 0x1, 0x4042fb4)
/Users/fabian/go/src/github.com/fstab/grok_exporter/exporter/grok.go:31 +0x9c fp=0xc4200517a8 sp=0xc4200516f0 pc=0x4372bfc
main.createMetrics(0xc420158000, 0xc42011a050, 0xc4204fa0a8, 0x0, 0x0, 0x0, 0x0, 0xc420064970)
/Users/fabian/go/src/github.com/fstab/grok_exporter/grok_exporter.go:178 +0x15c fp=0xc420051b10 sp=0xc4200517a8 pc=0x43819ec
main.main()
/Users/fabian/go/src/github.com/fstab/grok_exporter/grok_exporter.go:62 +0x14d fp=0xc420051f80 sp=0xc420051b10 pc=0x438000d
runtime.main()
/usr/local/Cellar/go/1.9.2/libexec/src/runtime/proc.go:195 +0x226 fp=0xc420051fe0 sp=0xc420051f80 pc=0x402da76
runtime.goexit()
/usr/local/Cellar/go/1.9.2/libexec/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc420051fe8 sp=0xc420051fe0 pc=0x4059551

rax 0x0
rbx 0x1
rcx 0x5
rdx 0x700000392be4
rdi 0x46ed318
rsi 0x1
rbp 0x700000392d90
rsp 0x700000392aa0
r8 0x0
r9 0x0
r10 0x467bfb0
r11 0x6ffffbd165fc
r12 0x700000392bb0
r13 0x1
r14 0xc4204d6cc0
r15 0x4d00020
rip 0x438d4fa
rflags 0x10297
cs 0x2b
fs 0x0
gs 0x0

kindly look into this issue.

run as a Windows service

Could you add a feature to run grok_exporter as a Windows service? I have tried to register it with

sc.exe create grok_exporter binPath= "c:\grok_exporter\grok_exporter.exe -config c:\grok_exporter\config.yml"

but the service failed to start with the following error:

The grok_exporter service failed to start due to the following error: 
The service did not respond to the start or control request in a timely fashion.

Grok - fields with ( ) in them.

Hey,

Here is my Grok pattern, but for some reason it cannot find a match when I have the brackets in the Referer and User-Agent field names.

%{TIMESTAMP_ISO8601:logtime} %{WORD:s-sitename} %{WORD:s-computername} %{IPORHOST:s-ip} %{WORD:cs-method} %{NOTSPACE:cs-uri-stem} %{NOTSPACE:cs-uri-query} %{NUMBER:s-port} %{NOTSPACE:cs-username} %{IPORHOST:c-ip} %{NOTSPACE:cs-version} %{NOTSPACE:cs(User-Agent)} %{NOTSPACE:cs(Referer)} %{IPORHOST:cs-host} %{NUMBER:sc-status} %{NUMBER:sc-substatus} %{NUMBER:c-win32-status} %{NUMBER:sc-bytes} %{NUMBER:cs-bytes} %{NUMBER:time-taken}

Example log item:

2018-02-02 00:01:32 W3SVC1 UKAPPSVR 172.18.131.173 GET /123/I/Home/PLMonstants - 80 Joe+Bloggs 172.18.17.185 HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+Trident/7.0;+rv:11.0)+like+Gecko https://blahblah.co.uk/theappname/live/app/thingy localhost 200 0 0 3393 2644 90

I was using http://grokconstructor.appspot.com/do/match to validate.

Any ideas what I could be doing wrong, or is there something I can change in the query string to work around the bracket issue?

Unfortunately I cannot change the name of the field as we push into splunk as well.

Thanks.

Pete

Help with grok_exporter

Hello

I'm using grok exporter and here is what I want to achieve: I have a Java application whose log entry is in below format:

{"@Version":1,"source_host":"fstest-stage-bm-62","message":"Known host file not configured, using user known host file: /home/.ssh/known_hosts","thread_name":"Camel (camel-1) thread #4 - aws-s3://fstest-stage-bm-62","@timestamp":"2019-08-28T07:52:12.526+00:00","level":"INFO","logger_name":"org.apache.cam.file.remote.oerations"}

I want to configure a Prometheus alert for any 'ERROR' entry in the log level. Here is what the grok_exporter config.yml file looks like:

global:
    config_version: 2
input:
    type: file
    path: ./example/test.log
    readall: true # Read from the beginning of the file? False means we start at the end of the file and read only new lines.
grok:
    patterns_dir: ./patterns

metrics:
    - type: counter
      name: error_test
      help: Counter metric example
      match: '%{NUMBER} %{JAVACLASS} %{JAVALOGMESSAGE} %{JAVATHREAD} %{TOMCAT_DATESTAMP} %{LOGLEVEL:severity} %{JAVAMETHOD}'
      labels:
          grok_field_name: severity
          prometheus_label: severity
server:
    host: 0.0.0.0
    port: 9144

============================
The test log file has 4 log lines, with one log line having log level ERROR. I tried accessing http://IP:9144/metrics and I see the output below, but there is no metric created in Prometheus (grok_exporter is installed on the Prometheus host itself).

grok_exporter_line_processing_errors_total{metric="error_test"} 0
# HELP grok_exporter_lines_matching_total Number of lines matched for each metric. Note that one line can be matched by multiple metrics.
# TYPE grok_exporter_lines_matching_total counter
grok_exporter_lines_matching_total{metric="error_test"} 0
# HELP grok_exporter_lines_processing_time_microseconds_total Processing time in microseconds for each metric. Divide by grok_exporter_lines_matching_total to get the average processing time for one log line.
# TYPE grok_exporter_lines_processing_time_microseconds_total counter
grok_exporter_lines_processing_time_microseconds_total{metric="error_test"} 0
# HELP grok_exporter_lines_total Total number of log lines processed by grok_exporter.
# TYPE grok_exporter_lines_total counter
grok_exporter_lines_total{status="ignored"} 4
grok_exporter_lines_total{status="matched"} 0

I do see the metric on Prometheus, but it doesn't yield any value. Can someone please help me with a regex expression for my JSON log format, as I couldn't get the correct matching pattern.
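A rough sketch of a match keyed off the level field of the quoted JSON line (untested; it reuses the severity field name from the config above):

metrics:
    - type: counter
      name: error_test
      help: Count log lines by log level.
      match: '"level":"%{LOGLEVEL:severity}"'
      labels:
          severity: '{{.severity}}'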

Thanks

conditional labels

Hi,

Is it possible to have "conditional" labels ? For example, consider the following patterns :

FOO foo
BAR bar
FOOBAR %{FOO:foo}%{BAR:bar}?

If I match on %{FOOBAR} in my metrics, I can use {{.foo}} in the labels, because it will always be present. However, as soon as I want to use {{.bar}}, I run into problems: a line can match only %{FOO} and not %{BAR}, so sometimes bar isn't in the template context, and grok_exporter becomes unhappy: Warning: Skipping log line: foobar: unexpected result when calling onig_name_to_group_numbers()

I tried to work around it by using {{if .bar}}{{.bar}}{{end}} to no avail.

How could I achieve what I want here? Is there a way to check that a template variable exists without making grok_exporter bail on onig_name_to_group_numbers()?

Thanks

Can I start with multiple config files

Can I start a single grok_exporter with multiple config files?
Eg.
./grok_exporter -config file1 file2 file3 file4
or
./grok_exporter -config file1 -config file2 -config file3 -config file4

The reason I ask is that I have 4 static files in which I want to monitor the occurrence of a string.
In 2 files I expect the same string to exist, and the other 2 files have 2 other strings.
These are the strings:
*=INFO<string>=DETAIL:<string>=DETAIL
*=info
*=INFO

I created 4 config files, one per monitored file, each looking pretty much like:

global:
    config_version: 2
input:
    type: file
    path: <path>/server.xml
    readall: true # Read from the beginning of the file? False means we start at the end of the file and read only new lines.
grok:
    patterns_dir: ./patterns
    additional_patterns:
    - 'EXIM_MESSAGE [a-zA-Z ]*'
metrics:
    - type: counter
      name: uview_server_xml_trace_level
      help: traceSpecification in the uview server.xml
      match:  'traceSpecification=%{QUOTEDSTRING:LOGLEVEL}'
      labels:
        LOGLEVEL: '{{.LOGLEVEL}}'

With multiple files it seems to read only the first file.

grok_exporter_build_info{branch="master",builddate="2019-04-08",goversion="go1.12.2",platform="linux-amd64",revision="e2ba841",version="0.2.7"} 1

How to set up a Prometheus ALERT rule. With gauge, the final groked value is repeatedly reported forever, even if the line never occurs again.

I'm a newbie with Prometheus and exporters. Thanks for your great work. I believe this exporter is fairly useful for people doing monitoring.

I have a question about usage of this exporter.

Situation: I'm trying to use this exporter to retrieve metrics of CPU, memory and disk usage from the log messages of a certain OSS**, but it seems grok_exporter returns the same values repeatedly even after incoming logs have completely stopped. After stopping the target application (and its log output), the exporter continues to show me "CPU usage: x %" (as if it were still running) forever.
**Should I use some kind of API that returns metrics in real time? Yes, I should. Unfortunately, I cannot get nice metrics from its API, so I need to use logs to monitor them. :-(

Problem: I cannot distinguish whether this metric is dead or really keeping the same value. And it's difficult to write Prometheus queries, because stale values remain, so avg() and similar functions return wrong results.

Question: Is there any way to make the exporter forget very old values after an interval we choose? Or is there some other good practice to resolve this situation?

I tried:

cpu_percentage{app_id="<UUID>",app_name="<NAME>", app_container_index="<NUM>",instance="grok_exporter:9144",job="grok"}[1m]
5.00000000 @1493027556.79
5.00000000 @1493027571.79
5.00000000 @1493027586.79
5.00000000 @1493027601.79
...repeats forever

I don't know very nice solution but guessing:

cpu_percentage{app_id="<UUID>",app_name="<NAME>", app_container_index="<NUM>",instance="grok_exporter:9144",job="grok"}[1m]
5.00000000 @1493027556.79
5.00000000 @1493027571.79
(No more metric values are returned after the expiry interval passes, until new logs matching the same grok expression are received.)
...
X.XXXXXXXX @1493030000.79 (if new matched log message appears.)

Note: Using cumulative: true and the idelta() Prometheus function might be an imperfect workaround. idelta() returns 0 in the situation above; however, I can't tell whether the application [container] is dead or really keeping a value of 0. For example, a query like avg(idelta(cpu_percentage_cumulative{app_name="<NAME>", app_container_index=".+"}[60s])) returns incorrectly low values after we scale in (shrink) this application, because the shrunk containers' values stay at 0.
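For reference, grok_exporter has a retention setting on metrics (see CONFIG.md); a rough sketch of how it could be applied here, with the metric and label names taken from the query above and everything else hypothetical:

metrics:
- type: gauge
  name: cpu_percentage
  help: CPU usage parsed from the application log.           # hypothetical help text
  match: 'app %{WORD:app_name} CPU usage: %{NUMBER:cpu} %'   # hypothetical log format
  value: '{{.cpu}}'
  labels:
    app_name: '{{.app_name}}'
  retention: 5m   # the series is dropped if it is not updated for 5 minutes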

Unable to create release .zip

Unable to find the dist/ dir after running the release script. Logs below

some_dir/gowork/src/github.com/grok_exporter > git log | head
commit 27d47f313b43a9f0981c83a9a742fe716b6f84ee
Author: Fabian Stäber <[email protected]>
Date:   Sun May 5 22:05:48 2019 +0200

    #5 add file tailer test for multiple log files

commit f6cf48430d31d72d88526efd4339200567c5b348
Author: Fabian Stäber <[email protected]>
Date:   Wed May 1 22:45:58 2019 +0200

some_dir/gowork/src/github.com/grok_exporter > ./release.sh  linux-amd64
?       github.com/fstab/grok_exporter  [no test files]
ok      github.com/fstab/grok_exporter/config   (cached)
ok      github.com/fstab/grok_exporter/config/v1        (cached)
ok      github.com/fstab/grok_exporter/config/v2        (cached)
ok      github.com/fstab/grok_exporter/exporter (cached)
ok      github.com/fstab/grok_exporter/oniguruma        (cached)
ok      github.com/fstab/grok_exporter/tailer   (cached)
?       github.com/fstab/grok_exporter/tailer/fswatcher [no test files]
ok      github.com/fstab/grok_exporter/tailer/glob      (cached)
ok      github.com/fstab/grok_exporter/template (cached)
Building dist/grok_exporter-0.2.8-SNAPSHOT.linux-amd64.zip
go version: go version go1.12.2 linux/amd64
some_dir/gowork/src/github.com/grok_exporter > ll
total 288K
-rw-r--r-- 1 raokru warp   32 May  8 14:49 AUTHORS
-rw-r--r-- 1 raokru warp 3.5K May  8 14:49 BUILTIN.md
drwxr-xr-x 4 raokru warp 4.0K May  8 14:49 config/
-rw-r--r-- 1 raokru warp  22K May  8 14:49 CONFIG.md
-rw-r--r-- 1 raokru warp  14K May  8 14:49 CONFIG_v1.md
drwxr-xr-x 2 raokru warp 4.0K May  8 14:49 example/
drwxr-xr-x 2 raokru warp 4.0K May  8 14:49 exporter/
-rw-r--r-- 1 raokru warp  550 May  8 14:49 go.mod
-rw-r--r-- 1 raokru warp  11K May  8 14:49 grok_exporter.go
-rw-r--r-- 1 raokru warp 2.1K May  8 14:49 HOWTO_VERIFY_RELEASES.md
-rwxr-xr-x 1 raokru warp  12K May  8 14:49 integration-test.sh*
-rw-r--r-- 1 raokru warp  10K May  8 14:49 LICENSE
drwxr-xr-x 2 raokru warp 4.0K May  8 14:49 logstash-patterns-core/
-rw-r--r-- 1 raokru warp  673 May  8 14:49 NOTICE
drwxr-xr-x 2 raokru warp 4.0K May  8 14:49 oniguruma/
-rw-r--r-- 1 raokru warp 8.1K May  8 14:49 README.md
-rwxr-xr-x 1 raokru warp 7.4K May  8 14:49 release.sh*
-rw-r--r-- 1 raokru warp 141K May  8 14:49 screenshot.png
drwxr-xr-x 4 raokru warp 4.0K May  8 14:49 tailer/
drwxr-xr-x 2 raokru warp 4.0K May  8 14:49 template/
some_dir/gowork/src/github.com/grok_exporter > 

support newer version of Oniguruma

Hello, I am a packager for Gentoo linux.

I have been asked to look into packaging this and making it available to our users.

I see from your README.md that you require oniguruma 5.9.6; however, that version is not available in Gentoo. We have versions 6.8.2, 6.9.0 and 6.9.1.

What are your plans for updating this? Do you have any idea when you will support a newer version?

Thanks much.

Input filenames with timestamp

Hello,
I've seen that support for multiple files is one of the most requested features, but would it be possible to use a file that has a datestamp in the filename? I'm monitoring an application that appends the current date and time to its log files. I know that wildcards that are being considered as part of multiple file support would 'solve' this issue, but it would potentially mean reading through dozens of log files when all I'm really interested in is the most recent one.

Thanks.

Example filename structure:
(screenshot of the datestamped filenames omitted)
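As an aside, newer config versions appear to accept glob patterns in the file input path, which would cover datestamped filenames; a sketch, assuming glob support and a hypothetical naming scheme:

input:
  type: file
  path: /var/log/myapp/myapp-*.log  # hypothetical; matches the datestamped log files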

to lowercase(or uppercase) is not available with the gsub command

The following config will result in an error.

labels:
  label: '{{gsub .label "[[:lower:]]" "\\U\\0"}}'
Failed to load /etc/grok_exporter/grok.yml: invalid configuration: failed to read metric label test_metric error parsing label template: syntax error in gsub call: '\U\0' is not a valid replacement: invalid escape sequence: don't forget to put a . (dot) in front of grok fields, otherwise it will be interpreted as a function.

Allow mutations on data

TL;DR: Create new fields from other fields (or replace values of existing ones) via regex or other built-in functions, just like Logstash's mutate plugin.

Context: I've just discovered this tool (and mtail) after trying to perform tail+parse+count data processing in an existing PHP application (disclaimer: it failed).

grok_exporter seems great but there is one feature I would miss: data mutations.
The ability to alter fields before exporting to Prometheus (just like logstash's mutate plugin) would be awesome.

In my use case I am reading Apache access.log file and I want to export HTTP requests count with the following dimensions/labels:

  • Straightforward (simple grok field to Prometheus label):
    • Status code
    • HTTP verb
    • Response size
  • Add fixed labels: possible since #8
  • Field computation required:
    • Only keep the base URL (scheme+FQDN) from referrer (eg. http://example.com/foo.asp?id=42 => http://example.com): some regex would do (or I could adapt the line matching regex)
    • Extract query string part from the referrer and add some labels from it. For example, from http://example.com/foo.asp?id=42&source=github&foo=bar I want the following fields (thus labels): id and foo. I get that dropping labels is already something grok_exporter can do, so having a mutation that creates a label for every found query parameter is fine.

Other use cases (not mine):

  • Compute hash of string.
  • Anonymize values (eg. for the auth HTTP field, replace any value different than - by connected_user and the - value by guest).
  • Replace raw values by meaningful values (eg. 404 => Not Found).

IIS Support

Hi there.

Sorry if this is the wrong place to ask but has anyone built support for IIS logs? Cheers

Pete

Unable to capture multiline pattern

Hello,

I'm trying to capture multiline pattern from the following text:

2018-10-08 06:55:35.156330 0x00007f3e3569c700: <info> (health::main.cpp@169) peer-node-0
         ACNTST C : 20'032 |  ACNTST C HVA : 22     |  ACTIVE PINGS : 0      |     B WRITERS : 0      |
     BLK ELEM ACT : 0      |  BLK ELEM TOT : 2'217  |      BLKDIF C : 665    |        HASH C : 13'013 |
       HASHLOCK C : 0      |   MEM CUR RSS : 65     |  MEM CUR VIRT : 774    |   MEM MAX RSS : 65     |
      MEM SHR RSS : 27     |      MOSAIC C : 1      |   MOSAIC C DS : 1      |          NS C : 1      |
          NS C AS : 1      |       NS C DS : 1      | RB COMMIT ALL : 0      | RB COMMIT RCT : 0      |
    RB IGNORE ALL : 0      | RB IGNORE RCT : 0      |       READERS : 3      |  SECRETLOCK C : 0      |
    SUCCESS PINGS : 0      |         TASKS : 11     |   TOTAL PINGS : 0      |   TS NODE AGE : 13     |
    TS OFFSET ABS : 0      | TS OFFSET DIR : 0      |  TS TOTAL REQ : 0      |   TX ELEM ACT : 0      |
      TX ELEM TOT : 3'081  |  UNLKED ACCTS : 1      |      UT CACHE : 0      |       WRITERS : 0      |

    2018-10-08 06:55:35.156661 0x00007f3e3569c700: <info> (health::main.cpp@169) peer-node-1
         ACNTST C : 20'032 |  ACNTST C HVA : 22     |  ACTIVE PINGS : 0      |     B WRITERS : 0      |
     BLK ELEM ACT : 0      |  BLK ELEM TOT : 2'236  |      BLKDIF C : 665    |        HASH C : 13'013 |
       HASHLOCK C : 0      |   MEM CUR RSS : 65     |  MEM CUR VIRT : 770    |   MEM MAX RSS : 66     |
      MEM SHR RSS : 27     |      MOSAIC C : 1      |   MOSAIC C DS : 1      |          NS C : 1      |
          NS C AS : 1      |       NS C DS : 1      | RB COMMIT ALL : 0      | RB COMMIT RCT : 0      |
    RB IGNORE ALL : 0      | RB IGNORE RCT : 0      |       READERS : 2      |  SECRETLOCK C : 0      |
    SUCCESS PINGS : 0      |         TASKS : 10     |   TOTAL PINGS : 0      |   TS NODE AGE : 13     |
    TS OFFSET ABS : 0      | TS OFFSET DIR : 0      |  TS TOTAL REQ : 0      |   TX ELEM ACT : 0      |
      TX ELEM TOT : 2'063  |      UT CACHE : 0      |       WRITERS : 1      |

I want:
Capture the MEM CUR VIRT value that is correlated with peer-node-0 (that would be the first occurrence of MEM CUR VIRT)

I wrote the following pattern in config.yml:

global:
    config_version: 2
input:
    type: file
    path: ./example/my_file.log
    readall: true # Read from the beginning of the file? False means we start at the end of the file and read only new lines.
grok:
    patterns_dir: ./patterns
metrics:
    - type: gauge
      name: peer_node_0_MEM_CUR_VIRT
      help: peer_node_0_MEM_CUR_VIRT
      match: '(?m)%{GREEDYDATA}peer-node-0%{GREEDYDATA}MEM%{SPACE}CUR%{SPACE}VIRT%{SPACE}:%{SPACE}%{INT:data}%{GREEDYDATA}peer-node-1%{GREEDYDATA}'
      value: '{{.data}}'
server:
    port: 9144

What I get:
This doesn't work, even though it works on http://grokdebug.herokuapp.com/.

If I write following match:

%{GREEDYDATA}MEM%{SPACE}CUR%{SPACE}VIRT%{SPACE}:%{SPACE}%{INT:data}%{GREEDYDATA}

it will work in grok_exporter, but I will not have the correlation to a specific peer-node (the last occurrence will be taken)

Is there any other way to capture a multi-line pattern in grok_exporter?

should be runnable on localhost

I use authguard to set up SSL and authentication to prevent misuse of sensitive data from our Apache logs. Please implement an option for running the exporter on localhost, like many other exporters do.
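For reference, a later config on this page shows a host setting under the server section; a sketch of binding only to the loopback interface, assuming that setting is available in your version:

server:
    host: localhost
    port: 9144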

grok_exporter couldn't keep up with the load

We are running grok_exporter in a Docker container. Logstash collects log data and writes it to the logfile that grok_exporter is picking up. Logstash also writes to another file that Splunk is picking up.

Splunk reports higher transaction counts, which we believe are correct; grok_exporter, however, seems unable to keep up and only reports about 1/3 of the TPS. grok_exporter seems to catch up only during the early-morning low-traffic hours.

There is no out-of-memory or high-CPU issue.

We have 1 counter, 2 gauges, 3 histograms, each with 11 labels. In the /metrics endpoint, it produces about 70k different time series.

How do you recommend going about debugging this issue?

Build Info:

branch="master",builddate="2018-05-31",goversion="go1.10.2",platform="linux-amd64",revision="62d82f8",version="0.2.5"

Config Info:

global:
    config_version: 2
input:
    type: file
    path: /usr/share/logstash/output/log/apg_stats.log
    fail_on_missing_logfile: false
    readall: false
    poll_interval_seconds: 5

invalid UTF-8 label value - Missing codec in config file

Our log files are written in ASCII and can contain special characters like äöü used in German.
I searched for a way to set the codec in config.yml like in a Logstash config file:

codec => json { charset => "ASCII-8BIT" }

Is this possible to add to the config.yml?

Ignore regex support

Thanks for the library and your passion. Is there a way to skip matching some records by overriding a match rule? I keep getting a lot of 404s from /phpMyAdmin/ and other bot attacks. I want to avoid creating time series for them by ignoring them in grok_exporter.

Is there any way to achieve this with the current grok_exporter?

LICENSE.md

Please add the appropriate LICENSE.md file to the root of your project.

Great work.

Regex Capture Groups

Release: v0.2.5

I have the following log line I am trying to grok pattern match:

{"blah": "foo", "somethin": "baz", "status": 200}

My Grok Pattern (tested on https://regex101.com, and has captured value of 200):

STATUS "status":\s+([^",}]*)

My Grok Metric (Config v2):

match: '%{STATUS:status}'
labels:
    status: '{{.status}}'

I want .status to be just 200, but instead I am getting "status": 200

The grok pattern in match seems to be using the whole matched sequence from the pattern definition rather than the capture group. Is this a bug or a feature?
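One possible workaround (a sketch, untested against this release): capture the interesting part as a named grok field inside the custom pattern, and reference that sub-capture in the label instead of the whole pattern match:

STATUS "status":\s+%{NUMBER:status_value}

match: '%{STATUS}'
labels:
    status: '{{.status_value}}'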

Feature request: Retention does not work for metrics without labels.

It would be great if we could use retention for metric values without labels.

Currently, retention is only supported with labels in metrics.

Our issue: While converting the units of values in the file, i.e. from KiB/s to MiB/s, I am observing two different series in the metrics, which eventually display as two different plots/lines in the Grafana graph dashboard.

Example:

metrics:
- type: gauge
  name: bandwidth_randwrite
  help: FIO Bandwidth Random Write Gauge Metrics
  match: ' write: IOPS=%{GREEDYDATA}, BW=%{NUMBER:val2}%{GREEDYDATA:kibs} %{GREEDYDATA}'
  value: '{{if eq .kibs "KiB/s"}}{{divide .val2 1024}}{{else}}{{.val2}}{{end}}'
  labels:
    bw_unit: '{{.kibs}}'
  cumulative: false
  retention: 1m

Result Metrics:
bandwidth_randwrite{bw_unit="KiB/s"} 2.209
bandwidth_randwrite{bw_unit="MiB/s"} 3.205

grok_exporter v 0.2.6 resident memory rising

I have a fairly frequently updated log file in which I am using grok_exporter to count specific log entries. I see that the resident memory usage of the exporter is quite high and growing. This happens about 3/4 of the time in my environment.

Graph:
(memory usage graph omitted)

The config is:

global:
  config_version: 2
grok:
  additional_patterns:
  - BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
  - DATA .*?
  - NUMBER (?:%{BASE10NUM})
  - DAEMON [a-zA-Z_\-]+
  - NAELOGLEVEL LOG_[\S+]
  - MESSAGE [\S\s\S]?*
  - YEAR (?>\d\d){1,2}
  - HOUR (?:2[0123]|[01]?[0-9])
  - MINUTE (?:[0-5][0-9])
  - SECOND (?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)
  - TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
  - MONTHNUM (?:0?[1-9]|1[0-2])
  - MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
  - ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
  - TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
  - LEVEL (LOG_)?([Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|DBG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr|ERR?(?:OR)?|[Cc](rit)+(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)
input:
  path: /var/log/messages
  poll_interval_seconds: 5
  type: file
metrics:
- help: all log entries separated by daemon
  labels:
    daemon: '{{.daemon}}'
  match: '%{TIMESTAMP_ISO8601:time} %{DAEMON:daemon}.*%{LEVEL:level}.*'
  name: benchmark_all_daemon_log_entry_counts
  type: counter
- help: count of panic entries in the log by go daemons
  labels:
    daemon: '{{.daemon}}'
  match: '%{TIMESTAMP_ISO8601:time} %{DAEMON:daemon}.*panic.*'
  name: benchmark_go_daemon_log_panic_counts
  type: counter

[other confidential counters]

Logrotate

Hi,

Can we have inotify support, so that when a file is rotated we pick up the new one? Thanks.

How retention works?

In my grok configuration file, I enabled the retention setting for gauge metrics:
retention: 5m
I use the curl command to get the metrics.
Even after 10 minutes since the metrics were generated, I can still see my self-defined metrics in http://localhost:9144/metrics.

Would someone help me understand how the retention setting works, and why the metrics still exist in http://localhost:9144/metrics?
command I use:
curl http://localhost:9144/metrics|grep my_metrics | wc -l
...
546

Should I use some parameter setting to get the correct result? I expect expired metrics not to be present in /metrics.

support counting for multi-field combinations

I really like the grok_exporter - it works like a charm with little effort and I can reuse my grok patterns.

Is there any way to use multiple fields for counting? For example:
timestamp, field1, field2, field3 ...

I would like to be able to count (and group by) "field1.field3", instead of counting each field separately.
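For what it's worth, a counter with several labels is already counted per label combination; a sketch, assuming hypothetical field names for a line of the form above:

metrics:
- type: counter
  name: events_by_field1_field3_total   # hypothetical metric name
  help: Events counted by the combination of field1 and field3.
  match: '%{TIMESTAMP_ISO8601:timestamp}, %{WORD:field1}, %{WORD:field2}, %{WORD:field3}'   # hypothetical line format
  labels:
    field1: '{{.field1}}'
    field3: '{{.field3}}'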

Only parse/calculate the last line/match?

Would it be possible to configure the grok section to only create metrics for the last matched line? I am trying to grok CSV files but each line already contains the aggregate result.

referencedGrokFields does not detect all referenced fields

When a grok field is referenced in the following way:

{{if eq .field "value"}}text{{end}}

The field is not detected by referencedGrokFields (in metrics.go).

Example config:

global:
    config_version: 2
input:
    type: file
    path: ./input
    readall: true
grok:
    patterns_dir: ./patterns
metrics:
    - type: gauge
      name: example
      help: not empty
      match: '%{NOTSPACE:val1} %{NOTSPACE:val2}'
      value: '{{.val1}}'
      labels:
          my_label: '{{if eq .val2 "test"}}yes{{else}}no{{end}}'

With input file:

1 test
2 nomatch

This results in the following error:

WARNING: Skipping log line: unexpected error while evaluating my_label template: template: my_label:1:5: executing "my_label" at <eq .val2 "test">: error calling eq: invalid type for comparison

If we change the definition of my_label to:

my_label: '{{.val2}} {{if eq .val2 "test"}}yes{{else}}no{{end}}'

The metric works as expected:

# HELP example not empty
# TYPE example gauge
example{my_label="nomatch no"} 2
example{my_label="test yes"} 1

duplicated metrics with node_exporter

I see grok-exporter outputs some go_mem* metrics by default. But all these metrics are already available from node-exporter.
One important point is that some metrics have different help strings between grok_exporter and node-exporter. This causes Prometheus to complain about the difference.
e.g.:
node-exporter: # HELP go_memstats_sys_bytes Number of bytes obtained from system.
grok-exporter: # HELP go_memstats_sys_bytes Number of bytes obtained by system. Sum of all system allocations.
Would you please remove the metrics duplicated with node exporter, or update the help text string?

How to hide default labels in grok

Hello All,

I want to hide default labels in Grok Exporter output.

My Output is:


WLSRESTARTFLAG{exported_instance="https://171.17.22.7:2002/console/login/LoginForm.jsp",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.6:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 1
WLSRESTARTFLAG{exported_instance="https://171.17.22.5:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.4:3012/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.3:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.2:3012/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0
WLSRESTARTFLAG{exported_instance="https://171.17.22.1:3011/weblogicStatus/index.html",instance="muclhp522:9968",job="grok"} 0

I don't want the instance and job labels in the Prometheus output, as they are causing difficulties in setting up alerts in Alertmanager.

How can I hide them? I don't want them to be printed.

Thanks
Priyotosh

Replace Function in Matched Captures

Hi,
Firstly, thank you for the wonderful work on this; it works really well.
I just wanted to check if there is a way in which subtext within a capture group can be replaced with something else?

For example, I have a lot of URLs being captured, like:
/abc/def/ghi/number/123123/jkl/mno
/abc/def/ghi/number/123124/jkl/mno
/abc/def/ghi/number/987654/jkl/mno
/abc/def/ghi/number/654763/jkl/mno

Currently I do not need the digits in the URL; they are creating a lot of unique metrics. Is there a way in which, after capturing, the URL can be replaced with something like this:
/abc/def/ghi/number/x/jkl/mno

This would also mean instead of 5 separate metrics with count 1, I would have 1 metric with count 5.
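A rough sketch using the gsub template function (which appears elsewhere on this page) to collapse the numeric path segment before it becomes a label; the url field name is hypothetical:

labels:
    url: '{{gsub .url "/number/[0-9]+/" "/number/x/"}}'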

Thanks,
Varunn

Exporter enters high CPU state and stops processing log lines

Issue observed with 0.2.6, 0.2.7, and a source build from master on April 25.

I have configured 3 instances of grok_exporter on CentOS. 2 instances run without issue, but 1 will only export metrics for a few minutes before entering a high-CPU state. In the high-CPU state the exporter still responds to scrape requests. The backlog grows at 3-4 rows per second. While operating normally, the "bad" instance's grok_exporter_lines_processing_time_microseconds_total is < 500 µs, and this metric drops to 0 once the instance enters the high-CPU state.

What debug information can I collect to help investigate this issue?

Grok exporter shows changes only after restart

We have configured Grok exporter to monitor errors from various system logs. But it seems changes are only reflected once we restart the respective grok instance.

Please see the config.yml below:

global:
    config_version: 2
input:
    type: file
    path: /ZAMBAS/logs/Healthcheck/EFT/eftcl.log
    readall: true
    poll_interval_seconds: 5

grok:
    patterns_dir: ./patterns

metrics:
    - type: gauge
      name: EFTFileTransfers
      help: Counter metric example with labels.
      match: '%{WORD:Status}\s%{GREEDYDATA:FileTransferTime};\s\\%{WORD:Customer}\\%{WORD:OutboundSystem}\\%{GREEDYDATA:File};\s%{WORD:Operation};\s%{NUMBER:Code}'
      value: '{{.Code}}'
      cumulative: false
      labels:
          Customer: '{{.Customer}}'
          OutboundSystem: '{{.OutboundSystem}}'
          File: '{{.File}}'
          Status: '{{.Status}}'
          Operation: '{{.Operation}}'
          FileTransferTime: '{{.FileTransferTime}}'

    - type: gauge
      name: EFTFileSuccessfullTransfers
      help: Counter metric example with labels.
      match: 'Success\s%{GREEDYDATA:Time};\s\\%{WORD:Customer}\\%{WORD:OutboundSystem}\\%{GREEDYDATA:File};\s%{WORD:Operation};\s%{NUMBER:Code}'
      value: '{{.Code}}'
      cumulative: false

    - type: gauge
      name: EFTFileFailedTransfers
      help: Counter metric example with labels.
      match: 'Failed\s%{GREEDYDATA:Time};\s\\%{WORD:Customer}\\%{WORD:OutboundSystem}\\%{GREEDYDATA:File};\s%{WORD:Operation};\s%{NUMBER:Code}'
      value: '{{.Code}}'
      cumulative: false

server:
    port: 9845

Without a restart, it doesn't reflect the correct matching patterns. Once I restart the grok instance, it reflects them perfectly.

I have used the parameter "poll_interval_seconds: 5" suggested in a different issue, but this doesn't help me either.
Is there some parameter I am missing here?

Thanks Priyotosh

Tailer is not streaming lines if the file is a symbolic link

Hi,

It seems like the Tailer is not pushing lines over the channel (Lines()) when the given filepath is a symbolic link.

I am instantiating the tailer this way:
logtailer := tailer.RunFseventFileTailer(filepath, false, true, logger)

the filepath is pointing to the file named access.log that is a symbolic link

-rw-r--r-- 1 dbenque dbenque  521 Feb 20 23:38 access0
lrwxrwxrwx 1 dbenque dbenque    7 Feb 20 23:38 access.log -> access0

At the same time I have a process writing to the file access0. The lines finally arrive in one bulk when the process writing to access0 is terminated.

Would it be possible to support Symbolic Link for the tailer?

Note:

> uname -a
Linux ncelrnd0228 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Thanks,
David

[tailer package] macOS bug

consider the following code inside a file named t.go

package main

import (
	"fmt"

	"github.com/fstab/grok_exporter/tailer"
)

func main() {
	t := tailer.RunFileTailer("hello", false, nil)

	for line := range t.Lines() {
		fmt.Println(line)
	}
}

the file hello contains the following

line1

Now run t.go: it will wait for new lines to be written to hello; it won't read anything yet, since we passed false to RunFileTailer.

now add line2 to hello

$ echo "line2" >> hello

our program will output the following:

line2

Great, that's exactly what we're looking for. Now let's add line3:

$ echo "line3" >> hello

our program will output the following:

line1
line2
line3

Not exactly what we wanted: instead of getting only the newest line, we got the whole file.

env

Go 1.8
macOS 10.12.3
go test passes

grok - log rotation issue

I'm running grok_exporter version 0.2.6 (build date: 2018-10-08, branch: master, revision: 81c0afe, go version: go1.11.1, platform: linux-amd64). It starts successfully and parses log files, but my log files are rotated after reaching a certain size. When these log files get rotated, I see an error message and the grok process gets killed. Am I doing something wrong?

error message:
Starting server on http://0.0.0.0:9142/metrics
error reading log lines: failed to watch /var/prod/logs/xxxx/haproxy/http_hap.log: open /var/prod/logs/xxxx/haproxy/http_hap.log: permission denied

grok log files:
-rwxr-xr-x. 1 xxxx yyyy 206714514 May 27 03:15 http_hap.log-20190527.gz
-rwxr-xr-x. 1 xxxx yyyy 215198130 May 28 03:46 http_hap.log-20190528.gz
-rwxr-xr-x. 1 xxxx yyyy 291469041 May 29 03:19 http_hap.log-20190529.gz
-rwxr-xr-x. 1 xxxx yyyy 694520310 May 29 10:03 http_hap.log

Count doesn't decrease even if there are no errors currently

Hello All,

We have configured Grok exporter to monitor errors from the web service logs. We see that even when there are NO errors it still prints the past count of errors.

We have used "gauge" as the metric type and polling the log file every 5 secs.

Please see the config.yml below:


global:
    config_version: 2
input:
    type: file
    path: /ZAMBAS/logs/Healthcheck/AI/ai_17_grafana.log
    readall: true
    poll_interval_seconds: 5

grok:
    patterns_dir: ./patterns

metrics:
    - type: counter
      name: OutOfThreads
      help: Counter metric example with labels.
      match: '%{GREEDYDATA} WARN!! OUT OF THREADS: %{GREEDYDATA}'

    - type: counter
      name: OutOfMemory
      help: Counter metric example with labels.
      match: '%{GREEDYDATA}: Java heap space'

    - type: gauge
      name: NoMoreEndpointPrefix
      help: Counter metric example with labels.
      match: '%{GREEDYDATA}: APPL%{NUMBER:val1}: IO Exception: Connection refused %{GREEDYDATA}'
      value: '{{.val1}}'
      cumulative: false

    - type: gauge
      name: IOExceptionConnectionReset
      help: Counter metric example with labels.
      match: '   <faultstring>APPL%{NUMBER:val3}: IO Exception: Connection reset'
      value: '{{.val3}}'
      cumulative: false

    - type: gauge
      name: IOExceptionReadTimedOut
      help: Counter metric example with labels.
      match: '   <faultstring>APPL%{NUMBER:val4}: IO Exception: Read timed out'
      value: '{{.val4}}'
      cumulative: false

    - type: gauge
      name: FailedToConnectTo
      help: Counter metric example with labels.
      match: "   <faultstring>RUNTIME0013: Failed to connect to '%{URI:val5}"
      value: '{{.val5}}'
      cumulative: false

server:
    port: 9244


Result:


grok_exporter_lines_matching_total{metric="FailedToConnectTo"} 0
grok_exporter_lines_matching_total{metric="IOExceptionConnectionReset"} 0
grok_exporter_lines_matching_total{metric="IOExceptionReadTimedOut"} 3
grok_exporter_lines_matching_total{metric="NoMoreEndpointPrefix"} 0
grok_exporter_lines_matching_total{metric="OutOfMemory"} 0
grok_exporter_lines_matching_total{metric="OutOfThreads"} 0


Say for 1 hour there were NO errors; it still shows '3' errors, and when an error does occur it keeps adding up. So in total it becomes 4, and so on... it keeps on adding :(

I want grok to show only the present data without adding previous values.

Please help us figure out what I am doing wrong.

Thanks
Priyotosh

Time between loglines

As suggested in #46 the time between two loglines is quite often a useful statistic.

In one of the applications running on our cluster, we get loglines like the following:

[2018-11-26 11:12:26 +0000] INFO   com.somejavaclass.redacted.processingnode.verticle.ProcessingJobVerticle Received processing job with jobId: 9bfb55a0-ae31-42c6-a2df-00160e65986c
[2018-11-26 11:12:26 +0000] INFO   com.somejavaclass.redacted.processingnode.verticle.ProcessingJobVerticle Created path for source file: /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8
[2018-11-26 11:12:26 +0000] FINE   com.somejavaclass.redacted.http.DownloadVerticle Received download message : {"com.somejavaclass.redacted.http.DownloadVerticle.save.path.key":"/data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8","com.somejavaclass.redacted.http.DownloadVerticle.download.url.key":"https://redacted.host.name/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc"}
[2018-11-26 11:12:26 +0000] FINE   com.somejavaclass.redacted.http.ZookeeperVerticle Received resolve url message.
[2018-11-26 11:12:26 +0000] FINE   com.somejavaclass.redacted.http.ZookeeperVerticle Url 'https://redacted.host.name/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc' is not an internal url, using it unmodified.
[2018-11-26 11:12:26 +0000] FINE   com.somejavaclass.redacted.http.DownloadVerticle Found https url, configuring all trusting ssl client options.
[2018-11-26 11:12:26 +0000] INFO   com.somejavaclass.redacted.http.DownloadVerticle Downloading file from 'https://redacted.host.name:-1/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc' and saving to: /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8
[2018-11-26 11:12:26 +0000] WARNING io.netty.util.internal.logging.Slf4JLogger warn Failed to find a usable hardware address from the network interfaces; using random bytes: d2:41:fa:d0:d3:8c:b6:b9
[2018-11-26 11:12:26 +0000] INFO   com.somejavaclass.redacted.http.DownloadVerticle Beginning download...
[2018-11-26 11:12:29 +0000] FINE   com.somejavaclass.redacted.http.DownloadVerticle Closed file /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8.
[2018-11-26 11:12:29 +0000] INFO   com.somejavaclass.redacted.processingnode.verticle.ProcessingJobVerticle Download finished from: https://redacted.host.name/source/2F7dMYtNsRkMJGexWfLz-y/Eexz2sMfTiE89Gr536fSMc. File saved at : /data/input/0b487e4a-4d65-49b8-a321-4b8a1f83d6b8

The time between Download started (11:12:26) and Download finished (11:12:29) for a job is actually really interesting to us.

For multiple events, my suggestion for the flow (least confusion and easiest to code):

11:00 [event 2 fires] (without event 1), do nothing because we have nothing to time against
11:01 [event 1 fires] log time.
11:02 [event 2 fires] check for a tracked event 1 and store the time between now and then and output the metric
11:03 [event 1 fires] log time
11:04 [event 1 fires] log time, overwrite slot
11:05 [event 2 fires] check for a tracked event 1 and store the time between now and then and output the metric (1 minute)
11:06 [event 2 fires] check for a tracked event 1 and store the time between now and then and output the metric (2 minutes because 11:06-11:04)

Whilst this doesn't give the ability to correlate between events, it's the least surprising and easiest to code. If there is some kind of correlation ID and events might nest or fire asynchronously, a regex for correlation id might be a nice to have, but that requires more storage, and lots more metric buckets.

@fstab thoughts?
