signalfx-collectd-plugin's Introduction

ℹ️  SignalFx was acquired by Splunk in October 2019. See Splunk SignalFx for more information.

SignalFx Metadata Plugin

Provides metadata about the host and, if configured to do so, sends collectd notifications to SignalFx.

Example Config:

LoadPlugin python
TypesDB "/opt/signalfx-collectd-plugin/types.db.plugin"
<Plugin python>
  ModulePath "/opt/signalfx-collectd-plugin"
  LogTraces true
  Interactive false
  Import "signalfx_metadata"
  <Module signalfx_metadata>
    URL "https://ingest.signalfx.com/v1/collectd"
    Token "<<<<<<INSERT_TOKEN_HERE>>>>>>"
    Notifications true
    NotifyLevel "OKAY"
    Utilization true
    Interval 10
  </Module>
</Plugin>

For metadata:

  • ProcessInfo: whether to collect process information (true or false). Default is true.
  • Notifications: whether to emit notifications from the plugin (true or false). Default is false. Note that the plugin sends its own metadata as events in addition to these notifications.
  • URL: where to send notifications as JSON. The URL in the example is the default. Supports multiple entries.
  • Token: SignalFx API token used to authenticate. No default. Required for metadata unless sending through a proxy. Supports multiple entries, but the number of tokens must equal the number of URLs.
  • Interval: how often the plugin collects and sends data. Default is 10.
  • NotifyLevel: to emit notifications beyond the ones generated by this plugin, set this to the appropriate level. "OKAY" emits all notifications, "WARNING" emits "WARNING" and "ERROR" notifications, and "ERROR" emits only errors.
  • Utilization: whether to send utilization metrics. Default is true.
  • PerCoreCPUUtil: whether to send utilization metrics for each processor. Default is false.
  • Datapoints: whether to send metrics about max round trip time, plugin uptime, and notification sending errors. Default is true.
  • ProcPath: an alternate proc path to parse for process information. Default is /proc.
  • EtcPath: an alternate etc path to parse for OS release information. Default is /etc.
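The NotifyLevel thresholds and the URL/Token pairing rule described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the names SEVERITY_RANK, should_emit, and pair_endpoints are hypothetical, and the ranking simply mirrors the three level names used in the list.

```python
# Hypothetical ranking that mirrors the NotifyLevel names in the option list.
# A higher rank means a more severe notification.
SEVERITY_RANK = {"OKAY": 0, "WARNING": 1, "ERROR": 2}


def should_emit(severity, notify_level):
    """Return True if a notification of `severity` passes the threshold."""
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[notify_level]


def pair_endpoints(urls, tokens):
    """Pair each URL with its token, rejecting mismatched counts."""
    if len(urls) != len(tokens):
        raise ValueError(
            "got %d URL(s) but %d token(s); the counts must match"
            % (len(urls), len(tokens))
        )
    return list(zip(urls, tokens))
```

With NotifyLevel set to "WARNING", should_emit("ERROR", "WARNING") is true and should_emit("OKAY", "WARNING") is false, which matches the behavior the option list describes.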

For DogstatsD support:

  • To enable reading of DogstatsD metrics, add a line similar to the following inside the Module block of your config: DogStatsDPort 8126
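Once the port is enabled, a client could send a metric to it roughly as shown here. This is a sketch under assumptions: it assumes the listener accepts the standard DogStatsD datagram format (name:value|type|#tags) over UDP on the local host, and the helper names format_gauge and send_metric as well as the metric name are made up for illustration.

```python
import socket


def format_gauge(name, value, tags=None):
    """Build a DogStatsD gauge line such as 'my.metric:3|g|#env:dev'."""
    line = "%s:%s|g" % (name, value)
    if tags:
        line += "|#" + ",".join(tags)
    return line


def send_metric(line, host="127.0.0.1", port=8126):
    """Send one metric line as a single UDP datagram (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("utf-8"), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Port 8126 matches the DogStatsDPort value in the example above.
    send_metric(format_gauge("example.queue.depth", 42, ["env:dev"]))
```

Because UDP is connectionless, send_metric reports no error when nothing is listening, so a missing DogStatsDPort line fails silently from the client's point of view.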

signalfx-collectd-plugin's Issues

Issue with SIGCHLD handling

We are seeing an intermittent issue on some of our agents, where the signalfx-metadata plugin never sends data back correctly. Strangely, it doesn't seem to happen all the time, and sometimes a new server starts it without issues. It only affects this one plugin:

[2019-01-04 11:30:25] signalfx-metadata: executing SIGCHLD workaround
[2019-01-04 11:30:25] dogstatsd: plugin init <collectd_dogstatsd.DogstatsDConfig object at 0x1db5110>
[2019-01-04 11:30:25] dogstatsd: dogstatsd port listening not enabled
[2019-01-04 11:30:25] docker : Collecting stats about Docker containers from unix://var/run/docker.sock (API version 1.38; timeout: 3s).
[2019-01-04 11:30:25] check_capability: unsupported capability implementation. Some plugin(s) may require elevated privileges to work properly.
[2019-01-04 11:30:25] check_capability: unsupported capability implementation. Some plugin(s) may require elevated privileges to work properly.
[2019-01-04 11:30:25] Initialization complete, entering read-loop.
[2019-01-04 11:30:25] signalfx-metadata: found host ip-172-27-86-15.ec2.internal
[2019-01-04 11:30:25] Unhandled python exception in read callback: IOError: [Errno 10] No child processes
[2019-01-04 11:30:25] Traceback (most recent call last):
[2019-01-04 11:30:25]   File "/opt/signalfx-collectd-plugin/signalfx_metadata.py", line 976, in send
    send_datapoints()
[2019-01-04 11:30:25]   File "/opt/signalfx-collectd-plugin/signalfx_metadata.py", line 1505, in send_datapoints
    PLUGIN_UPTIME, get_linux_version(), platform.release(),
[2019-01-04 11:30:25]   File "/usr/lib64/python2.6/platform.py", line 1291, in release
    return uname()[2]
[2019-01-04 11:30:25]   File "/usr/lib64/python2.6/platform.py", line 1239, in uname
    processor = _syscmd_uname('-p','')
[2019-01-04 11:30:25]   File "/usr/lib64/python2.6/platform.py", line 996, in _syscmd_uname
    rc = f.close()
[2019-01-04 11:30:25] read-function of plugin `python.signalfx_metadata' failed. Will suspend it for 20.000 seconds.

collectd keeps suspending the plugin for longer and longer periods of time when this happens. I can't find evidence that it recovers afterwards.

Here's the line where it occurs:

PLUGIN_UPTIME, get_linux_version(), platform.release(),

From googling, it seems there is already code that tries to fix this issue, and I can see it being executed in the log line signalfx-metadata: executing SIGCHLD workaround. I am wondering if there's some kind of race condition here: either the workaround is not executed early enough for the plugin to be initialized properly, or it is not executed in another thread, or something like that.
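The traceback shows platform.release() on Python 2.6 reaching _syscmd_uname, which runs an external command and fails when closing its pipe. One possible way to avoid depending on child-process handling at all is to read the kernel release through os.uname(), which is a direct system call. This is only a sketch of that idea, not a confirmed fix for the plugin, and kernel_release is a hypothetical helper name.

```python
import os


def kernel_release():
    """Return the kernel release string without spawning a child process.

    os.uname() wraps the uname(2) system call, so no fork or pipe is
    involved and SIGCHLD handling cannot interfere with the result.
    Index 2 is the release field, the same field platform.release() returns.
    """
    return os.uname()[2]


if __name__ == "__main__":
    print(kernel_release())
```

The trade-off is portability: os.uname() is only available on Unix-like systems, which is acceptable here because the plugin already reads /proc and /etc.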

Heavy Memory usage up to 11GB

Hi All,

Before 0.29 we were experiencing heavy memory usage, up to 4GB. Now we have a machine running version 0.0.32 on which the collectd process has 11GB of virtual memory. It looks like the memory leak is still there. Could you take a look?

Same information

  • Running on EC2 AWS-LINUX
  • R3 box with 30GB of ram.
  • The issue happened last Friday, 27 October.

Version

Verify version: cat /opt/signalfx-collectd-plugin/signalfx_metadata.py | grep VERSION | head -n 1
VERSION = "0.0.32"

Memory

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                          
30240 root      20   0 11.2g 213m  26m S  3.3  0.7   4:05.70 collectd 
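The top output above reports 11.2g of virtual memory (VIRT) but 213m resident (RES). To see which of the two grows over time, both values could be sampled periodically from /proc. The sketch below assumes a Linux host and uses the VmSize and VmRSS fields of /proc/<pid>/status; the helper names are hypothetical.

```python
import re

# Matches lines such as "VmRSS:\t  218112 kB" in /proc/<pid>/status.
_VM_LINE = re.compile(r"^(VmSize|VmRSS):\s+(\d+)\s+kB")


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of a status file."""
    fields = {}
    for line in status_text.splitlines():
        match = _VM_LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_vm_fields(pid):
    """Read and parse /proc/<pid>/status for the given process ID."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vm_fields(handle.read())


if __name__ == "__main__":
    import os

    # Sample this process; for collectd, pass its PID instead.
    print(read_vm_fields(os.getpid()))
```

Logging these two numbers at a fixed interval would show whether resident memory climbs along with virtual memory or stays flat.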

SignalFx collectd log

stance = host_mem_total, type = objects, type_instance = host-meta-data, message = 31417376
[2017-11-02 00:55:13] signalfx-metadata: till next metadata 86987.9999981 seconds
[2017-11-02 07:22:07] signalfx-metadata: unsuccessful response: <urlopen error _ssl.c:478: The handshake operation timed out>
[2017-11-02 07:37:36] signalfx-metadata: unsuccessful response: <urlopen error _ssl.c:478: The handshake operation timed out>
[2017-11-02 14:50:16] write_http plugin: HTTP Error code: 0
[2017-11-02 14:50:16] write_http plugin: curl_easy_perform failed with status 28: Operation timed out after 3001 milliseconds with 0 bytes received
[ec2-user@ip-172-30-199-77 signalfx-collectd-plugin]$ 
