
log-analytics-prometheus's Introduction

Prometheus + Loki + Grafana based Log Analytics and Metrics for JFrog Artifactory, Xray

The JFrog Log Analytics and Metrics solution using Prometheus consists of three components:

  1. Prometheus - the component where metrics data gets ingested
  2. Loki - the component where log data gets ingested
  3. Grafana - the component where data visualization is achieved via prebuilt dashboards

Pre-Requisites

  1. Working and configured Kubernetes Cluster - Amazon EKS / Google GKE / Azure AKS / Docker Desktop / Minikube

    1. Recommended Kubernetes Version 1.25.2 and above
    2. For Google GKE, refer GKE Guide
    3. For Amazon EKS, refer EKS Guide
    4. For Azure AKS, refer AKS Guide
    5. For Docker Desktop and Kubernetes, refer DOCKER Guide
  2. 'kubectl' utility on the workstation, able to connect to the Kubernetes cluster (a quick version check for kubectl and Helm is sketched after this list)

    1. For Installation and usage refer KUBECTL Guide
  3. HELM v3 Installed

    1. For Installation and usage refer HELM Guide
  4. Versions supported and Tested:

    JFrog Platform: 10.17.3

    Artifactory: 7.77.8

    Xray: 3.92.7

    Prometheus: 2.51.0

    Grafana: 10.4.0

    Loki: 2.9.6
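
Before moving on, a quick way to confirm the workstation tooling and cluster connectivity is sketched below (these are standard kubectl/Helm commands; the minimum versions reflect the table above):

# Client and server versions; the server should be 1.25 or newer
kubectl version
# Helm should report v3.x
helm version --short
# Confirms the current kube context can reach the cluster
kubectl cluster-info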

Read me before installing

Important Note: This version replaces all previous implementations. It is not an in-place upgrade of the existing JFrog solution but a full reinstall. Any dashboard customizations made on previous versions will need to be redone after this install.

This guide assumes the implementer is performing a new setup; changes needed to install into an existing setup are highlighted where applicable.
    If Prometheus is already installed and configured, we recommend having the existing Prometheus release name handy.
    If Loki is already installed and configured, we recommend having its service URL handy.

If Prometheus and Loki are already available, you can skip the Installation section and proceed to the Configuration section.

Warning

The old Docker registry partnership-pts-observability.jfrog.io, which contains older versions of this integration, is now deprecated. The existing Docker images will remain on the old registry until August 1st, 2024; after that date, the registry will no longer be available. Please run helm upgrade on your JFrog Kubernetes deployment so that images are pulled from the new releases-pts-observability-fluentd.jfrog.io registry, as specified in the Helm value files above. Doing so avoids ImagePullBackOff errors in your deployment once the old registry is gone.

Installation

Installing Prometheus, Loki and Grafana on Kubernetes

The Prometheus Community kube-prometheus-stack Helm chart creates Prometheus instances and includes Grafana. The Grafana Community loki Helm chart creates Loki instances, which can then be linked to that same Grafana alongside Prometheus.

Once the pre-requisites are met, install the Prometheus Kubernetes stack as follows:

Create the Namespace required for Prometheus Stack deployment

export INST_NAMESPACE=jfrog-plg

We will use jfrog-plg as the namespace throughout this document. That said, you can use a different or existing namespace instead by setting the above variable.

kubectl create namespace $INST_NAMESPACE
kubectl config set-context --current --namespace=$INST_NAMESPACE  

Install the Prometheus chart

Note: This installation also includes Grafana.

Add the required Helm Repositories:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Install the chart:

helm upgrade --install "prometheus" prometheus-community/kube-prometheus-stack -n $INST_NAMESPACE
* "prometheus" here is the value that needs to be used against the value for "release_name" in the configuration section

For Docker Desktop, run the following additional command to correct the mount path propagation for the Prometheus node-exporter component. Without it, an error event appears such as: "Error: failed to start container "node-exporter": Error response from daemon: path / is mounted on / but it is not a shared or slave mount"

kubectl patch ds prometheus-prometheus-node-exporter --type "json" -p '[{"op": "remove", "path" : "/spec/template/spec/containers/0/volumeMounts/2/mountPropagation"}]' -n $INST_NAMESPACE
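
Whichever platform you are on, it is worth confirming the stack is healthy before continuing. A minimal check, assuming the release name "prometheus" used above:

# All pods from the kube-prometheus-stack release should reach Running/Ready
kubectl get pods -n $INST_NAMESPACE
# The operator creates this service for the Prometheus instance; it is used for port-forwarding later
kubectl get svc prometheus-operated -n $INST_NAMESPACE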

Install the Loki chart

Add the required Helm Repositories:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Install the chart:

helm upgrade --install "loki" --values helm/loki-values.yaml grafana/loki --version 5.48.0 -n $INST_NAMESPACE

💡 The above helm command pins the Loki chart version to 5.48.0, since this integration was only tested with Loki 2.9.6. Loki 3.x charts (v6.0.x and up) have a breaking change, so if you would like to install Loki 3.x, please visit Loki's official docs and provide your own loki-values.yaml.

* "loki" will be the service name, the url to access loki as a datasource can be visualised as http://<service_name>.<namespace>:<port>
      ex: http://loki.$INST_NAMESPACE:3100 will be the "loki_url" value

* version 2.9.6 is the most recent loki version at the time of writing the document
      if there is a need to deploy this exact version, change the version value in "--set loki.image.tag=my_desired_version" to your desired version.
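
To sanity-check the Loki endpoint from inside the cluster, something along these lines can be used (the throwaway curl pod and the /ready endpoint are illustrative; adjust to your environment):

# Confirm the Loki service exists in the namespace
kubectl get svc loki -n $INST_NAMESPACE
# Probe Loki's readiness endpoint from a temporary pod
kubectl run loki-check --rm -it --restart=Never --image=curlimages/curl -n $INST_NAMESPACE -- \
  curl -s http://loki.$INST_NAMESPACE:3100/ready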

JFrog Platform + Metrics via Helm ⎈

Ensure the JFrog repo is added to Helm.

helm repo add jfrog https://charts.jfrog.io
helm repo update

To configure and install the JFrog Platform with Prometheus metrics exposed, use our file helm/jfrog-platform-values.yaml; it exposes the metrics endpoint and a new ServiceMonitor to Prometheus.

JFrog Platform ⎈:

helm upgrade --install jfrog-platform jfrog/jfrog-platform \
       -f helm/jfrog-platform-values.yaml \
       -n $INST_NAMESPACE

If you are installing in the same cluster as the deprecated solution, use that existing namespace instead of jfrog-plg above.

Artifactory / Artifactory HA + Metrics via Helm ⎈

For configuring and installing Artifactory Pro/Pro-x use the artifactory-values.yaml file.

For configuring and installing Enterprise/Ent+ use the artifactory-ha-values.yaml file.

Before starting the Artifactory or Artifactory HA installation, generate join and master keys for the installation:

export JOIN_KEY=$(openssl rand -hex 32)
export MASTER_KEY=$(openssl rand -hex 32)

Then helm install the Artifactory or Artifactory HA charts:

Artifactory ⎈:

Install the artifactory chart using the join and master keys generated above.

💡Note: You need to be at the root of this repository folder to have the helm/artifactory-values.yaml file available for the following command

helm upgrade --install artifactory jfrog/artifactory \
       --set artifactory.masterKey=$MASTER_KEY \
       --set artifactory.joinKey=$JOIN_KEY \
       -f helm/artifactory-values.yaml \
       -n $INST_NAMESPACE

If you are installing in the same cluster as the deprecated solution, use that existing namespace instead of jfrog-plg above.

Artifactory-HA ⎈:

Install the artifactory-ha chart using the join and master keys generated above.

💡Note: You need to be at the root of this repository folder to have the helm/artifactory-ha-values.yaml file available for the following command

helm upgrade --install artifactory-ha jfrog/artifactory-ha \
       --set artifactory.masterKey=$MASTER_KEY \
       --set artifactory.joinKey=$JOIN_KEY \
       -f helm/artifactory-ha-values.yaml \
       -n $INST_NAMESPACE

💡Note: If you are installing in the same cluster as the deprecated solution, use that existing namespace instead of jfrog-plg above. The above examples are references only; you will need additional parameters to configure TLS, binary blob storage, or other common Artifactory features.

This completes the necessary configuration for Artifactory and creates a new ServiceMonitor, servicemonitor-artifactory, which exposes the metrics to Prometheus.
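
To confirm the ServiceMonitor was actually created, a quick check (the exact name can vary with the release name):

kubectl get servicemonitors -n $INST_NAMESPACE
# Expect an entry similar to "servicemonitor-artifactory" in the output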

Xray + Metrics via Helm ⎈

To configure and install Xray with Prometheus metrics exposed, use our file helm/xray-values.yaml; it exposes the metrics endpoint and a new ServiceMonitor to Prometheus.

Xray ⎈:

Generate a master key for the Xray installation:

export XRAY_MASTER_KEY=$(openssl rand -hex 32)

Use the same JOIN_KEY from the Artifactory installation, in order to connect Xray to Artifactory.

💡Note: You need to be at the root of this repository folder to have helm/xray-values.yaml file available for the following command

# getting Artifactory URL
export RT_SERVICE_IP=$(kubectl get svc -n $INST_NAMESPACE artifactory-artifactory-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# helm install xray
helm upgrade --install xray jfrog/xray --set xray.jfrogUrl=http://$RT_SERVICE_IP \
       --set xray.masterKey=$XRAY_MASTER_KEY \
       --set xray.joinKey=$JOIN_KEY \
       --set xray.openMetrics.enabled=true \
       --set xray.metrics.enabled=true \
       -f helm/xray-values.yaml \
       -n $INST_NAMESPACE

If you are installing in the same cluster as the deprecated solution, use that existing namespace instead of jfrog-plg above.
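
As with Artifactory, a quick sanity check that the Xray pods are up and its ServiceMonitor is registered (the greps below are indicative only):

kubectl get pods -n $INST_NAMESPACE | grep xray
kubectl get servicemonitors -n $INST_NAMESPACE | grep -i xray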

Configuration

To verify the Prometheus setup

Use 'kubectl port-forward' as shown below in a separate terminal window:

   kubectl port-forward service/prometheus-operated 9090:9090 -n $INST_NAMESPACE

Go to the web UI of the Prometheus instance at "http://localhost:9090" and check "Status -> Service Discovery"; the list should show the new ServiceMonitor for Artifactory, Xray, or both.

Default user/password for Prometheus is -> UNAME/PASSWD
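
With the port-forward still active, the same check can be scripted against the Prometheus HTTP API; the queries below are examples only:

# List scrape targets and summarize their health; the new ServiceMonitors should report "up"
curl -s 'http://localhost:9090/api/v1/targets' | grep -o '"health":"[a-z]*"' | sort | uniq -c
# Run a simple instant query to confirm the API responds
curl -s 'http://localhost:9090/api/v1/query?query=up'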

To verify the Grafana setup

Use 'kubectl port-forward' as shown below in a separate terminal window:

   kubectl port-forward service/prometheus-grafana 3000:80 -n $INST_NAMESPACE
  1. Open Grafana in a browser at "http://localhost:3000" (Grafana default credentials are "admin" / "prom-operator").

  2. Go to "Configuratoin" -> "Data Sources" on the left menu:


  3. Add your Prometheus instance and Loki instance as data sources:

    • When adding the Loki data source, specify the url value as http://loki:3100
    • The Prometheus data source might be added automatically from config. If not, add a Prometheus data source and specify the url value as http://prometheus-kube-prometheus-prometheus:9090/
  4. While specifying the data source url for Loki and Prometheus, test and confirm that the connection is successful using the Save & Test button at the bottom of the add data source page.


    After adding both the Loki and Prometheus data sources, your "Configuration" -> "Data Sources" page should list them both. If you prefer to script this step, a sketch using the Grafana HTTP API follows these steps.
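
A minimal sketch for creating the data sources through the Grafana HTTP API, using the default admin credentials and the service URLs from this guide (adjust credentials and URLs to your setup, and keep the Grafana port-forward from above running):

# Loki data source
curl -s -u admin:prom-operator -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/datasources \
  -d '{"name":"Loki","type":"loki","url":"http://loki:3100","access":"proxy"}'

# Prometheus data source (skip if it was provisioned automatically)
curl -s -u admin:prom-operator -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/datasources \
  -d '{"name":"Prometheus","type":"prometheus","url":"http://prometheus-kube-prometheus-prometheus:9090","access":"proxy"}'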

Grafana Dashboard

Example dashboards are included in the grafana directory of this repository and need to be imported into Grafana.

After downloading the dashboards, go to "Dashboards" -> "Import":


Pick Upload JSON file and upload the Artifactory and Xray dashboard files that you downloaded in the previous step.

Import the dashboards and select the appropriate data sources (Loki and Prometheus). Alternatively, the import can be scripted via the Grafana HTTP API, as sketched below.

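A rough sketch of a scripted import, assuming jq is available; the file name artifactory-dashboard.json is a placeholder for a dashboard downloaded from the grafana directory. Dashboards that declare data-source inputs may still need the import UI to map Loki and Prometheus:

# Wrap the dashboard JSON in the payload expected by the API, then post it
jq '{dashboard: ., overwrite: true}' artifactory-dashboard.json > payload.json
curl -s -u admin:prom-operator -H 'Content-Type: application/json' \
  -X POST http://localhost:3000/api/dashboards/db -d @payload.json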



log-analytics-prometheus's Issues

Messy configuration

Hello jfrog colleagues,

I consider myself a decent Helm user, and even for me it is still confusing how you mix Prometheus with Loki. I will point out some improvements that could be made to this repo:

  1. repo name should be log-analytics, or observability, not log-analytics-prometheus
  2. https://github.com/jfrog/log-analytics-prometheus/blob/master/helm/jfrog-platform-values.yaml#L3 this should be the repository name
  3. https://github.com/jfrog/log-analytics-prometheus/blob/master/helm/jfrog-platform-values.yaml#L5 this should be loki, not prometheus
  4. https://github.com/jfrog/log-analytics-prometheus/blob/master/helm/jfrog-platform-values.yaml#L34 this should move to templates folder and be configured using values.yaml instead of big complex config for non proficient users.
  5. https://github.com/jfrog/log-analytics-prometheus/blob/master/helm/jfrog-platform-values.yaml#L16 If I had the chance, I would deploy this in an official remote instead of "partnership-public-images.jfrog.io". On my enterprise air-gapped setup, I will have another firewall exception to request.
  6. https://github.com/jfrog/log-analytics-prometheus/blob/master/helm/jfrog-platform-values.yaml#L14 I have seen this already with the old log strategy. I know this is advanced, but name one customer deploying on Kubernetes that doesn't want this. It should also go into values.

I will be completing the instrumentation with the currently documented configuration, but I hope my insights help ease adoption.

Use FluentD as a pod

I need to run FluentD as a separate deployment in EKS using Helm. In other words, how can I use the same configuration as this repo but with the FluentD Helm chart? Where should the contents of fluent.conf.rt go in the FluentD Helm chart?

Support for Kustomize

Hello,

At the moment we deploy Artifactory HA using Kustomize and other plugins such as khelm and ksops. That being said, there are several resources in this project that can be bundled perfectly as a kustomize component:

  • the custom init container and custom sidecar container can be turned into patches for primary and member STS manifests
  • new resources (service and serviceMonitor) can be added as plain resources
  • fluent.conf.rt and the grafana dashboards can be placed into configMaps via ConfigMapGenerators

Would it be possible to provide the contents of this repo as a kustomize component?
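
For reference, the kind of Kustomize component being requested might look roughly like the sketch below; the file names and layout are hypothetical, not something this repository currently ships:

# Hypothetical kustomization for a component bundling this repo's resources
cat > kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
configMapGenerator:
  - name: fluentd-config
    files:
      - fluent.conf.rt
  - name: grafana-dashboards
    files:
      - artifactory-dashboard.json
      - xray-dashboard.json
patches:
  - path: artifactory-sts-sidecar-patch.yaml   # would add the init/sidecar containers to the STS manifests
resources:
  - artifactory-metrics-service.yaml
  - artifactory-servicemonitor.yaml
EOF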

td-agent log error "broken pipe"

Hi,
When I run this setup, it works normally for about an hour or two, then td-agent hangs and Prometheus shows the instance as down. When I check the td-agent log, it shows the following (see attached screenshot); please take a look, thanks.

Artifactory metrics not being shown

Hi,

I was trying to use fluentd as an exporter of the JFrog Artifactory metrics for use with Prometheus. I followed the instructions in the readme file:

  1. Use the appropriate FluentD configuration file and copy it to /etc/td-agent/td-agent.conf.
    fluent.conf.rt - Artifactory version 7 server
    fluent.conf.rt6 - Artifactory version 6 server
    fluent.conf.xray - Xray server (3.x+)
  2. Restart td-agent.

But the metrics are not shown when I open the endpoint in the browser using the specified IP and port (see attached screenshot).

But if I run td-agent with the following command:

td-agent -c fluent.conf.rt6

It works as expected, as shown in the attached screenshot.

firewall issues

Ever since the migration from partnership-public-images.jfrog.io to releases-pts-observability-fluentd.jfrog.io, I have image pull issues due to our firewall.

I have been fighting with my network/security team to whitelist all redirects, but this is causing a lot of friction between that team and mine. They own the firewall and the firewall logs, and I don't have access. They say nothing is blocked, but it is.

I have managed to whitelist the following:

releases-pts-observability-fluentd.jfrog.io  
releases.jfrog.io  
endpointdns-prod-use1-lb.jfrog.io  
k8s-jfrogsaa-jfrogsaa-fb6f041eda-44b5904933c69ac1.elb.us-east-1.amazonaws.com
jfrog-prod-use1-shared-virginia-main.s3.amazonaws.com  
s3-1-w.amazonaws.com  
s3-w.us-east-1.amazonaws.com
jfrog-prod-use1-shared-virginia-main.s3.amazonaws.com 
s3-1-w.amazonaws.com 

but no success... the question is, why is this image not on releases-docker.jfrog.io like all the images in the official charts?

Loki 2.6.1 not starting (failed to create memberlist)

Trying to run Loki through the current version of the Helm deployment. Startup of the Loki pod is failing with the error message:

msg=“module failed” module=memberlist-kv error=“invalid service state: Stopping, expected: Running”

Same issue seen here

Here I assume that your current namespace is jfrog-plg. If not:

kc config set-context --current --namespace=jfrog-plg

Pull config

hl get values loki  -o yaml > values.loki.yaml

Fix by adding these lines to values.loki.yaml:

loki:
[...]
  config:
    memberlist:
      bind_addr:
        - ${MY_POD_IP}

  ingester:
    extraArgs:
    - -config.expand-env=true
    extraEnv:
      - name: MY_POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP

Update install

hl upgrade --install "loki" grafana/loki-stack -f values.loki.yaml

Prometheus promtool check

Hi @peters95,

Prometheus's promtool is reporting the following metric issues, since these metrics do not follow the OpenMetrics standards.

# curl -s http://FQDN:24231/metrics  | promtool check metrics  2>&1 | grep -v camelCase
fluentd_output_status_emit_count non-histogram and non-summary metrics should not have "_count" suffix
fluentd_output_status_flush_time_count non-histogram and non-summary metrics should not have "_count" suffix
fluentd_output_status_retry_count non-histogram and non-summary metrics should not have "_count" suffix
fluentd_output_status_rollback_count non-histogram and non-summary metrics should not have "_count" suffix
fluentd_output_status_slow_flush_count non-histogram and non-summary metrics should not have "_count" suffix
fluentd_output_status_write_count non-histogram and non-summary metrics should not have "_count" suffix
fluentd_status_retry_count non-histogram and non-summary metrics should not have "_count" suffix
jfrog_rt_access counter metrics should have "_total" suffix
jfrog_rt_access_audit counter metrics should have "_total" suffix
jfrog_rt_data_download counter metrics should have "_total" suffix
jfrog_rt_data_upload counter metrics should have "_total" suffix
jfrog_rt_log_level counter metrics should have "_total" suffix
jfrog_rt_req counter metrics should have "_total" suffix
jfrog_rt_service_message counter metrics should have "_total" suffix

also the following metrics are reporting camelCase usage,

# curl -s http://FQDN:24231/metrics  | promtool check metrics  2>&1 | grep -e camelCase | sort | uniq
jfrog_rt_data_download label names should be written in 'snake_case' not 'camelCase'
jfrog_rt_data_upload label names should be written in 'snake_case' not 'camelCase'
jfrog_rt_req label names should be written in 'snake_case' not 'camelCase'

[Bug] Metrics missing (Fluentd regex not really working)

Issue

I installed calyptia-fluentd on a RHEL 7 machine and ran the Fluentd agent with the provided Fluentd configuration (I only changed the paths to the Artifactory log files). When I test the Fluentd metrics endpoint with curl localhost:24231/metrics, I only get 5 metrics:

  • app_disk_used_bytes
  • app_disk_free_bytes
  • sys_memory_used
  • sys_memory_free
  • sys_cpu_ratio

Because these metrics are missing, the Grafana dashboards have a lot of empty panels and are useless.

How to reproduce

  • Install calyptia-fluentd agent on a RHEL 7 machine
  • Download the Fluentd configuration file
  • Change some paths to the Artifactory log files
  • Run the agent
  • Test the metric endpoint by running curl localhost:24231/metrics

What you expect

I get metrics for all log lines from the file artifactory-metrics.log

  • jfrt_runtime_heap_freememory_bytes
  • jfrt_runtime_heap_maxmemory_bytes
  • jfrt_runtime_heap_totalmemory_bytes
  • jfrt_runtime_heap_processors_total
  • jfrt_db_connections_active_total
  • jfrt_db_connections_idle_total
  • jfrt_db_connections_max_active_total
  • jfrt_db_connections_min_idle_total
  • jfrt_projects_active_total
  • jfrt_storage_current_total_size_bytes
  • app_disk_used_bytes
  • app_disk_free_bytes

and observability-metrics.log

  • app_io_counters_read_bytes
  • app_io_counters_write_bytes
  • app_self_metrics_calc_seconds
  • app_self_metrics_total
  • go_memstats_heap_in_use_bytes
  • go_memstats_heap_allocated_bytes
  • go_memstats_heap_idle_bytes
  • go_memstats_heap_objects_total
  • go_memstats_heap_reserved_bytes
  • go_memstats_gc_cpu_fraction_ratio
  • go_routines_total
  • sys_cpu_ratio
  • sys_load_1
  • sys_load_5
  • sys_load_15
  • sys_memory_used_bytes
  • sys_memory_free_bytes

I really like Prometheus, Loki, and Grafana, and I hope JFrog gives this more attention (a working setup and better documentation).

unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for 0.0.0.0:24224"

I have used the same td-agent.conf file provided in this repo (fluent.conf.rt) to collect application metrics of JFrog Artifactory on a CentOS VM. But when I start td-agent, I keep seeing the errors below that port 24224 is already in use.

2021-02-01 23:58:18 +0000 [error]: #0 unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for 0.0.0.0:24224"
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/socket.rb:201:in `bind'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/socket.rb:201:in `listen'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin_helper/server.rb:357:in `server_create_tcp_socket'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin_helper/server.rb:212:in `server_create_for_tcp_connection'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin_helper/server.rb:92:in `server_create_connection'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin/in_forward.rb:172:in `start'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:200:in `block in start'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:189:in `block (2 levels) in lifecycle'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:188:in `each'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:188:in `block in lifecycle'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:175:in `each'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:175:in `lifecycle'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:199:in `start'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/engine.rb:248:in `start'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/engine.rb:147:in `run'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/supervisor.rb:603:in `block in run_worker'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/supervisor.rb:840:in `main_process'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/supervisor.rb:594:in `run_worker'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/command/fluentd.rb:361:in `<top (required)>'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:72:in `require'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:72:in `require'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/bin/fluentd:8:in `<top (required)>'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/bin/fluentd:23:in `load'
  2021-02-01 23:58:18 +0000 [error]: #0 /opt/td-agent/bin/fluentd:23:in `<top (required)>'
  2021-02-01 23:58:18 +0000 [error]: #0 /sbin/td-agent:7:in `load'
  2021-02-01 23:58:18 +0000 [error]: #0 /sbin/td-agent:7:in `<main>'
2021-02-01 23:58:18 +0000 [error]: #0 unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for 0.0.0.0:24224"
  2021-02-01 23:58:18 +0000 [error]: #0 suppressed same stacktrace
2021-02-01 23:58:18 +0000 [info]: Worker 0 finished unexpectedly with status 1
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.rt.**.service" type="record_transformer"
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.rt.**.service" type="parser"
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.request" type="record_transformer"
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.rt.frontend.request" type="record_transformer"
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.rt.metadata.request" type="record_transformer"
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.**" type="record_transformer"
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.callhome" type="record_transformer"
2021-02-01 23:58:19 +0000 [info]: adding match pattern="jfrog.callhome" type="http"
2021-02-01 23:58:19 +0000 [warn]: #0 Status code 503 is going to be removed from default `retryable_response_codes` from fluentd v2. Please add it by yourself if you wish
2021-02-01 23:58:19 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.request" type="prometheus"
2021-02-01 23:58:20 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.service" type="prometheus"
2021-02-01 23:58:20 +0000 [info]: adding filter pattern="jfrog.rt.artifactory.access" type="prometheus"
2021-02-01 23:58:20 +0000 [info]: adding filter pattern="jfrog.rt.access.audit" type="prometheus"
2021-02-01 23:58:20 +0000 [info]: adding source type="prometheus"
2021-02-01 23:58:20 +0000 [info]: adding source type="monitor_agent"
2021-02-01 23:58:20 +0000 [info]: adding source type="forward"
2021-02-01 23:58:20 +0000 [info]: adding source type="prometheus_monitor"
2021-02-01 23:58:20 +0000 [info]: adding source type="prometheus_output_monitor"
2021-02-01 23:58:20 +0000 [info]: adding source type="prometheus_tail_monitor"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="tail"
2021-02-01 23:58:20 +0000 [info]: adding source type="exec"
2021-02-01 23:58:20 +0000 [info]: adding source type="exec"
2021-02-01 23:58:20 +0000 [info]: #0 starting fluentd worker pid=25680 ppid=25342 worker=0
2021-02-01 23:58:20 +0000 [info]: #0 [access_security_audit_tail] following tail of /app/jfrog/artifactory/var/log/access-security-audit.log
2021-02-01 23:58:20 +0000 [info]: #0 [artifactory_access_tail] following tail of /app/jfrog/artifactory/var/log/artifactory-access.log
2021-02-01 23:58:20 +0000 [info]: #0 [router_request_tail] following tail of /app/jfrog/artifactory/var/log/router-request.log
2021-02-01 23:58:20 +0000 [info]: #0 [metadata_request_tail] following tail of /app/jfrog/artifactory/var/log/metadata-request.log
2021-02-01 23:58:20 +0000 [info]: #0 [frontend_request_tail] following tail of /app/jfrog/artifactory/var/log/frontend-request.log
2021-02-01 23:58:20 +0000 [info]: #0 [artifactory_request_tail] following tail of /app/jfrog/artifactory/var/log/artifactory-request.log
2021-02-01 23:58:20 +0000 [info]: #0 [access_request_tail] following tail of /app/jfrog/artifactory/var/log/access-request.log
2021-02-01 23:58:20 +0000 [info]: #0 [router_traefik_tail] following tail of /app/jfrog/artifactory/var/log/router-traefik.log
2021-02-01 23:58:20 +0000 [info]: #0 [router_service_tail] following tail of /app/jfrog/artifactory/var/log/router-service.log
2021-02-01 23:58:20 +0000 [info]: #0 [metadata_service_tail] following tail of /app/jfrog/artifactory/var/log/metadata-service.log
2021-02-01 23:58:20 +0000 [info]: #0 [frontend_service_tail] following tail of /app/jfrog/artifactory/var/log/frontend-service.log
2021-02-01 23:58:20 +0000 [info]: #0 [artifactory_service_tail] following tail of /app/jfrog/artifactory/var/log/artifactory-service.log
2021-02-01 23:58:20 +0000 [info]: #0 [access_service_tail] following tail of /app/jfrog/artifactory/var/log/access-service.log
2021-02-01 23:58:20 +0000 [info]: #0 listening port port=24224 bind="0.0.0.0"
2021-02-01 23:58:20 +0000 [error]: #0 unexpected error error_class=Errno::EADDRINUSE error="Address already in use - bind(2) for 0.0.0.0:24224"
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/socket.rb:201:in `bind'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/socket.rb:201:in `listen'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin_helper/server.rb:357:in `server_create_tcp_socket'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin_helper/server.rb:212:in `server_create_for_tcp_connection'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin_helper/server.rb:92:in `server_create_connection'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/plugin/in_forward.rb:172:in `start'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:200:in `block in start'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:189:in `block (2 levels) in lifecycle'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:188:in `each'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:188:in `block in lifecycle'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:175:in `each'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:175:in `lifecycle'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/root_agent.rb:199:in `start'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/engine.rb:248:in `start'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/engine.rb:147:in `run'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/supervisor.rb:603:in `block in run_worker'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/supervisor.rb:840:in `main_process'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/supervisor.rb:594:in `run_worker'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/lib/fluent/command/fluentd.rb:361:in `<top (required)>'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:72:in `require'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:72:in `require'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.2/bin/fluentd:8:in `<top (required)>'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/bin/fluentd:23:in `load'
  2021-02-01 23:58:20 +0000 [error]: #0 /opt/td-agent/bin/fluentd:23:in `<top (required)>'
  2021-02-01 23:58:20 +0000 [error]: #0 /sbin/td-agent:7:in `load'
  2021-02-01 23:58:20 +0000 [error]: #0 /sbin/td-agent:7:in `<main>'

I checked for processes using this port so I could kill them, but couldn't find any. I also tried opening the range of ports 24210 to 24250; it still did not work. Any suggestions on how to resolve or troubleshoot this issue? Thanks!
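
For anyone hitting the same error, it usually means something is already bound to 24224 (often a second td-agent/fluentd instance, or a duplicated forward <source> block in the config). Standard Linux tooling can show what holds the port:

# Show the process listening on 24224
sudo ss -lntp | grep 24224
sudo lsof -i :24224
# Check whether more than one agent is running
ps aux | grep -E 'td-agent|fluentd' | grep -v grep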

Feature request - Replacement or alternative to jfrog_rt_service_message_total Prometheus metric

Description:

The jfrog_rt_service_message_total metric generates a lot of unnecessary data that has to be stored in our metrics backends for no real gain, since these series are difficult to query and are somewhat of an antipattern for time-series metrics.

Expected solution:

We would propose a replacement or alternative that can be toggled to instead emit categories of log line (SECURITY, ACCESS, GARBAGE COLLECTION) and their severity level (INFO, DEBUG, WARNING, etc.) where other methods are used to obtain high-cardinality log lines.

Possible workaround:

Disable the jfrog_rt_service_message_total metric (defined by the following block in the fluentd configuration).

<metric>
  name jfrog_rt_service_message_total
  type counter
  desc artifactory service message
  <labels>
    host ${hostname}
    message ${message}
  </labels>
</metric>

Repo and Artifact labels not present

For the request metrics (jfrog_rt_data_download_total, jfrog_rt_data_upload_total and jfrog_rt_req_total) the repo and artifact labels are always empty.

I am trying to use this with the artifactory-ha chart version 107.27.9.

Looking at fluent.conf.rt, it seems to expect the repo-related calls to have a request URI that begins with /api/download. However, this does not appear to be how downloads look in the request log. Here is an example from my logs:

2021-12-04T00:13:57.470Z|12a8601c7feea529|10.240.8.115|anonymous|GET|/api/npm/npm-remote/bower/-/bower-1.8.13.tgz|200|-1|4376294|78|npm/3.10.10 node/v6.17.1 linux x64

Add support for fluentd image to run as non-root

Description:

Currently, in the fluentd image, the fluentd client runs as root and writes to the container image filesystem. On a hardened Kubernetes cluster, containers are not allowed to run as root or to write to the container image root filesystem.

Link to the fluentd image

Expected solution:

Add support to run the fluentd client as non-root, without writing to the container image root filesystem, for cases where a secure K8s cluster is used.
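
For context, this is roughly the constraint a hardened cluster enforces on the container; the snippet below is illustrative only and is not a supported configuration of the current image:

# Illustrative container securityContext a non-root fluentd would need to satisfy
cat > fluentd-securitycontext-snippet.yaml <<'EOF'
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
# A writable emptyDir volume would then be needed for fluentd buffers instead of the image root filesystem
EOF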

jfrt_http_connections_* metrics no longer collected by sidecar

Hello,

On the open metrics artifactory page (https://www.jfrog.com/confluence/display/JFROG/Open+Metrics), the following metrics are defined:

jfrt_http_connections_available_total | Total number of available outbound HTTP connections
jfrt_http_connections_leased_total | Total number of available leased HTTP connections
jfrt_http_connections_pending_total | Total number of available pending HTTP connections
jfrt_http_connections_max_total | Total number of maximum HTTP connections

It appears that the sidecar no longer gathers this data. Is there a reason for this?

Thanks
