agent-install-helpers's Issues
Agent install scripts use incorrect group ID for docker group
$ cat install_cht_perfmon.sh | grep -i dock
# make sure docker group exists so that agent has access if docker
groupadd -f docker
useradd cht_agent -m -U -G docker
This creates docker as a regular user group (GID > 1000). The docker group needs to be a system group (GID < 1000) for the docker package to install:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=784690
Installing the CHT agent prior to installing docker will cause docker installation to fail.
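A minimal sketch of how the installer could avoid this, assuming `groupadd -r` (create a system group) and `getent` are available, as they are on most shadow-utils/glibc distributions. The helper names `is_system_gid` and `ensure_docker_system_group` and the 999 cutoff are illustrative assumptions, not part of the actual script; distributions define the real range via SYS_GID_MAX in /etc/login.defs.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the GID falls in the system range.
# The 999 default is an assumption; see SYS_GID_MAX in /etc/login.defs.
is_system_gid() {
    gid="$1"
    sys_gid_max="${2:-999}"
    [ "$gid" -le "$sys_gid_max" ]
}

# Hypothetical helper: create docker as a system group if it is missing,
# and warn if it already exists with a GID outside the system range.
ensure_docker_system_group() {
    if ! getent group docker >/dev/null 2>&1; then
        groupadd -r docker
        return
    fi
    existing_gid=$(getent group docker | cut -d: -f3)
    if ! is_system_gid "$existing_gid"; then
        echo "warning: docker group has non-system GID $existing_gid" >&2
    fi
}
```

Run as root, `ensure_docker_system_group` would be called before `useradd cht_agent -m -U -G docker`.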
Installer hangs forever
install_cht_perfmon.sh hangs forever in an unbounded while loop.
The failure is at line 240 of install_cht_perfmon.sh:
if [ $counter -gt $user_defined_frequency ]; then
$user_defined_frequency is empty here, so the test errors out and the condition is never true. Since this occurs in an unbounded while loop, the script never completes.
$user_defined_frequency is defined with the line:
user_defined_frequency=$(grep -A 1 "LoadPlugin cpu" /etc/chtcollectd/collectd.conf | grep Int | awk '{print $2}')
This command returns nothing, because collectd uses the default interval instead of defining it explicitly.
collectd is installed in this script with a custom RPM that does not respect the preexistence of the configuration file.
/etc/chtcollectd/collectd.conf:
<LoadPlugin cpu>
Interval
This means I can't even work around the problem by putting a file with an interval defined in place before running the script. The installer is unusable in its current form because collectd.conf does not define an interval for the cpu plugin.
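One possible guard, sketched under assumptions: the lookup from the script is wrapped in a hypothetical helper, `read_cpu_interval`, that falls back to a default when the config defines no interval. The helper name and the fallback of 10 seconds (chosen to match collectd's default global Interval) are not from the installer itself.

```shell
#!/bin/sh
# Hypothetical helper: print the cpu plugin interval from a collectd config,
# falling back to a default when the value is missing or not a number.
read_cpu_interval() {
    conf="$1"
    default="${2:-10}"   # assumed fallback value
    value=$(grep -A 1 "LoadPlugin cpu" "$conf" 2>/dev/null | grep Int | awk '{print $2}')
    case "$value" in
        ''|*[!0-9]*) echo "$default" ;;   # empty or non-numeric: use the default
        *) echo "$value" ;;
    esac
}
```

With this in place, `user_defined_frequency=$(read_cpu_interval /etc/chtcollectd/collectd.conf)` is never empty, so the `-gt` comparison inside the loop can succeed and the loop can end.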
What connectivity does the CloudHealth agent require?
Hello,
Can you please advise which outbound/inbound ports, IP addresses, and URLs need to be open for the agent to function properly?
Also, if it runs on AWS/EC2, does it require any Security Group inbound rules?
Thank you,
Yossi C.
Automate or provide README.md guidance to get latest scripts
The README.md refers specifically to v14 of these scripts. It would be helpful to mention the availability of more recent versions, namely the current v26, and how to determine the latest build.
collect config is broken
Operating System
Ubuntu Bionic Beaver 18.04
Installation
sh install_cht_perfmon.sh 20 <key> aws disable-update
Result
/etc/chtcollectd/collectd.conf
The collectd config file looks broken; some placeholders are not being replaced correctly.
Hostname "i-xxx"
FQDNLookup false
Interval 10
BaseDir "/var/lib/chtcollectd"
PIDFile "/var/run/chtcollectd.pid"
<LoadPlugin cpu>
Interval 10
</LoadPlugin>
<LoadPlugin df>
Interval 10
</LoadPlugin>
<LoadPlugin memory>
Interval 10
</LoadPlugin>
<LoadPlugin CHT_PERFMON_INTERFACE_ENABLED>
Interval CHT_PERFMON_INTERFACE_SAMPLE_INTERVAL
</LoadPlugin>
<LoadPlugin CHT_PERFMON_DISK_ENABLED>
Interval CHT_PERFMON_DISK_SAMPLE_INTERVAL
</LoadPlugin>
LoadPlugin csv
<LoadPlugin aggregation>
Interval 10
</LoadPlugin>
<Plugin aggregation>
<Aggregation>
Plugin "cpu"
Type "cpu"
GroupBy "Host"
GroupBy "TypeInstance"
CalculateAverage true
SetPlugin "cpu"
SetPluginInstance "%{aggregation}"
</Aggregation>
</Plugin>
<Plugin df>
ReportByDevice false
ValuesAbsolute true
ValuesPercentage false
</Plugin>
<Plugin csv>
StoreRates true
</Plugin>
LoadPlugin "match_regex"
<Chain "PostCache">
<Rule>
<Match regex>
Plugin "^cpu$"
PluginInstance "^[0-9]+$"
</Match>
<Target write>
Plugin "aggregation"
</Target>
Target stop
</Rule>
<Target write>
Plugin "csv"
</Target>
</Chain>
Replacing
<LoadPlugin CHT_PERFMON_INTERFACE_ENABLED>
Interval CHT_PERFMON_INTERFACE_SAMPLE_INTERVAL
</LoadPlugin>
<LoadPlugin CHT_PERFMON_DISK_ENABLED>
Interval CHT_PERFMON_DISK_SAMPLE_INTERVAL
</LoadPlugin>
with
<LoadPlugin interface>
Interval 10
</LoadPlugin>
<LoadPlugin disk>
Interval 10
</LoadPlugin>
works
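The manual fix above could be scripted roughly as follows. This is a sketch, not the installer's own logic: the helper name `fill_placeholders` is hypothetical, the placeholder names are taken from the generated config, and the substituted values (`interface`, `disk`, and an interval of 10) simply mirror the replacement described above.

```shell
#!/bin/sh
# Hypothetical helper: print a collectd config with the unreplaced
# CHT_PERFMON_* placeholders filled in. Values mirror the manual fix above.
fill_placeholders() {
    conf="$1"
    interval="${2:-10}"   # assumed sample interval
    sed -e 's/CHT_PERFMON_INTERFACE_ENABLED/interface/' \
        -e 's/CHT_PERFMON_DISK_ENABLED/disk/' \
        -e "s/CHT_PERFMON_INTERFACE_SAMPLE_INTERVAL/$interval/" \
        -e "s/CHT_PERFMON_DISK_SAMPLE_INTERVAL/$interval/" \
        "$conf"
}
```

The function writes to stdout, so the result can be reviewed before it replaces /etc/chtcollectd/collectd.conf.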
Installer fails in ubuntu containers without wget
Line 122 of install_cht_perfmon.sh should contain an else branch to handle installing wget on Debian-based distros.
The Red Hat yum install is handled, but the Debian case is not.
118 if ! command_exists wget ; then
119 if command_exists yum ; then
120 echo "** Installing wget first via yum.." 2>&1 | tee -a /tmp/agent_install_log.txt
121 yum install wget
122 fi
123 fi
Should become:
118 if ! command_exists wget ; then
119 if command_exists yum ; then
120 echo "** Installing wget first via yum.." 2>&1 | tee -a /tmp/agent_install_log.txt
121 yum install wget
122 else
123 echo "** Installing wget first via apt-get.." 2>&1 | tee -a /tmp/agent_install_log.txt
124 apt-get install wget -y
125 fi
126 fi
Thanks
Installation error on amazon linux 2 amazon-eks-node instance
Installing via user-data on ami: amazon-eks-node.
I am installing the helper like this:
wget https://s3.amazonaws.com/remote-collector/agent/v24/install_cht_perfmon.sh -O install_cht_perfmon.sh;
sh install_cht_perfmon.sh 24 $KEY aws;
I get the following error in the logs:
Thank you for installing agent!
** Setting up cht_perfmon
/opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/json-2.5.1/lib/json/common.rb:406:in `generate': "\\x8B" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/json-2.5.1/lib/json/common.rb:406:in `pretty_generate'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/facter-4.0.51/lib/facter/framework/formatters/json_fact_formatter.rb:25:in `format_for_no_query'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/facter-4.0.51/lib/facter/framework/formatters/json_fact_formatter.rb:13:in `format'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/facter-4.0.51/lib/facter.rb:439:in `to_user_output'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/facter-4.0.51/lib/facter/framework/cli/cli.rb:114:in `query'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor/command.rb:27:in `run'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor/invocation.rb:127:in `invoke_command'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor.rb:392:in `dispatch'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor/base.rb:485:in `start'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/facter-4.0.51/lib/facter/framework/cli/cli_launcher.rb:23:in `start'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/facter-4.0.51/bin/facter:10:in `<top (required)>'
from /opt/cht_perfmon/embedded/bin/facter:23:in `load'
from /opt/cht_perfmon/embedded/bin/facter:23:in `<main>'
/opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/json-2.5.1/lib/json/common.rb:216:in `parse': 809: unexpected token at '' (JSON::ParserError)
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/json-2.5.1/lib/json/common.rb:216:in `parse'
from /opt/cht_perfmon/embedded/lib/ruby/gems/2.7.0/gems/cht_perfmon-0.0.72/bin/cht_perfmon_installer.rb:40:in `<top (required)>'
from /opt/cht_perfmon/embedded/bin/cht_perfmon_installer.rb:23:in `load'
from /opt/cht_perfmon/embedded/bin/cht_perfmon_installer.rb:23:in `<main>'
** Restarting cht_perfmon
cht_perfmon_collector: warning: no instances running. Starting...
** Restarting collectd with cht_perfmon configs
Could not verify cht_perfmon was running, stopping collectd
IMDSv2 support?
After switching to IMDSv2:
** Setting up cht_perfmon
Failed to detect AWS environment via facter, trying with direct access
Failed to find a valid i-xxx AWS identifier for this server.
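For context, IMDSv2 requires a session token: a client first issues a PUT to the token endpoint, then passes the token in a header on each metadata request. Below is a minimal sketch of that flow. It assumes `curl` is available (the installer itself uses wget), and the helper name `fetch_instance_id`, the 60-second token TTL, and the 5-second timeout are illustrative choices, not the agent's actual behavior.

```shell
#!/bin/sh
# Hypothetical helper: fetch the EC2 instance ID using an IMDSv2 session token.
IMDS_BASE="http://169.254.169.254/latest"

fetch_instance_id() {
    # Step 1: request a short-lived session token (TTL is an assumed value).
    token=$(curl -s -f --max-time 5 -X PUT "$IMDS_BASE/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
    # Step 2: present the token when reading instance metadata.
    curl -s -f --max-time 5 -H "X-aws-ec2-metadata-token: $token" \
        "$IMDS_BASE/meta-data/instance-id"
}
```

On an EC2 instance, `fetch_instance_id` should print the instance identifier and return non-zero if either request fails.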
Use the cloud-health-agent inside a docker container inside an EC2
I want to run the cloud-health-agent inside a Kubernetes cluster on EC2. I'd prefer not to run it directly on the host.
Here's something I put together to make it work:
https://github.com/coveo/docker-cloud-health-agent
The library used to detect machine information has a bug or limitation: when it runs inside Docker, it cannot see the EC2 instance metadata. Fixing that would help a lot.
Failed to install Cht_agent on Redhat Linux
Agent fills disk with memory-buffered-unanalyzed files
We just had an issue with the CHT agent creating multiple 700+ MB memory-buffered-unanalyzed files starting on 20160928184802
The instance is an AWS Amazon Linux m3.xlarge running redis.
The files are:
[ec2-user@ip- ~]$ ll -ah /var/lib/chtcollectd/i-/memory
total 2.6G
drwxrwsrwx 2 root cht_agent 4.0K Sep 28 19:01 .
drwxrwsrwx 7 root cht_agent 4.0K Dec 22 2015 ..
-rw-rw-r-- 1 root cht_agent 3.5K Sep 28 19:01 memory-buffered-2016-09-28
-rw-rw-r-- 1 cht_agent cht_agent 930M Sep 28 18:45 memory-buffered-unanalyzed20160928184312
-rw-rw-r-- 1 cht_agent cht_agent 898M Sep 28 18:45 memory-buffered-unanalyzed20160928184316
-rw-rw-r-- 1 cht_agent cht_agent 731M Sep 28 18:45 memory-buffered-unanalyzed20160928184338
-rw-r--r-- 1 cht_agent cht_agent 1.8M Sep 28 18:48 memory-buffered-unanalyzed20160928184802
-rw-rw-r-- 1 root cht_agent 3.6K Sep 28 19:01 memory-cached-2016-09-28
-rw-rw-r-- 1 cht_agent cht_agent 8.3K Sep 28 18:42 memory-cached-unanalyzed20160928184247
-rw-rw-r-- 1 root cht_agent 3.6K Sep 28 19:01 memory-free-2016-09-28
-rw-rw-r-- 1 cht_agent cht_agent 8.3K Sep 28 18:42 memory-free-unanalyzed20160928184247
-rw-rw-r-- 1 root cht_agent 3.6K Sep 28 19:01 memory-used-2016-09-28
-rw-rw-r-- 1 cht_agent cht_agent 8.3K Sep 28 18:42 memory-used-unanalyzed20160928184247
ruby perfmon collector does not send metrics
Operating System
Ubuntu Bionic Beaver 18.04
Installation
sh install_cht_perfmon.sh 20 <key> aws disable-update
Result
perfmon collector
I've tapped the instance with mitmproxy to see why metrics are not being sent to https://chapi.cloudhealthtech.com/metrics/v1
(according to the docs)
There are no requests to the endpoint.