doctorkafka's Issues

Breaking kafkastats change between 0.2.3 and 0.2.4.2

For a number of clusters, I have dead or super-low-volume topics. With the 0.2.3 kafkastats client, DoctorKafka displayed the cluster data correctly. With 0.2.4.2, however, hasFailure is set to True whenever the JMX collector cannot collect a metric such as BytesOutPerSec. In both cases I get the same log message:

2019-01-10 23:36:09.108 [StatsReporter] WARN  com.pinterest.doctorkafka.stats.BrokerStatsRetriever - Got exception for doctorkafka.operator_report
javax.management.InstanceNotFoundException: kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=doctorkafka.operator_report
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095) ~[?:1.8.0_181]
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:643) ~[?:1.8.0_181]
	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) ~[?:1.8.0_181]
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445) ~[?:1.8.0_181]
	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) ~[?:1.8.0_181]
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309) ~[?:1.8.0_181]
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401) ~[?:1.8.0_181]
	at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639) ~[?:1.8.0_181]
	at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) ~[?:1.8.0_181]
	at sun.rmi.transport.Transport$1.run(Transport.java:200) ~[?:1.8.0_181]
	at sun.rmi.transport.Transport$1.run(Transport.java:197) ~[?:1.8.0_181]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
	at sun.rmi.transport.Transport.serviceCall(Transport.java:196) ~[?:1.8.0_181]
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) ~[?:1.8.0_181]
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) ~[?:1.8.0_181]
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) ~[?:1.8.0_181]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687) ~[?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
	at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:283) ~[?:1.8.0_181]
	at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:260) ~[?:1.8.0_181]
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161) ~[?:1.8.0_181]
	at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) ~[?:?]
	at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source) ~[?:1.8.0_181]
	at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:903) ~[?:1.8.0_181]
	at com.pinterest.doctorkafka.stats.KafkaMetricRetrievingTask.call(KafkaMetricRetrievingTask.java:30) ~[kafkastats-0.2.4.2-jar-with-dependencies.jar:?]
	at com.pinterest.doctorkafka.stats.KafkaMetricRetrievingTask.call(KafkaMetricRetrievingTask.java:11) ~[kafkastats-0.2.4.2-jar-with-dependencies.jar:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

However, the old client sends good data:

{'amiId': 'ami-xxxxxxxx',
 'availabilityZone': 'us-east-1c',
 'cpuUsage': 1.4,
 'failureReason': None,
 'followerReplicas': [{'partition': 13, 'topic': '__consumer_offsets'},
  {'partition': 23, 'topic': '__consumer_offsets'},
  {'partition': 19, 'topic': '__consumer_offsets'},
  {'partition': 17, 'topic': '__consumer_offsets'},
  {'partition': 32, 'topic': '__consumer_offsets'},
  {'partition': 26, 'topic': '__consumer_offsets'},
  {'partition': 7, 'topic': '__consumer_offsets'},
  {'partition': 40, 'topic': '__consumer_offsets'},
  {'partition': 5, 'topic': '__consumer_offsets'},
  {'partition': 3, 'topic': '__consumer_offsets'},
  {'partition': 34, 'topic': '__consumer_offsets'},
  {'partition': 47, 'topic': '__consumer_offsets'},
  {'partition': 16, 'topic': '__consumer_offsets'},
  {'partition': 14, 'topic': '__consumer_offsets'},
  {'partition': 41, 'topic': '__consumer_offsets'},
  {'partition': 10, 'topic': '__consumer_offsets'},
  {'partition': 49, 'topic': '__consumer_offsets'},
  {'partition': 31, 'topic': '__consumer_offsets'},
  {'partition': 29, 'topic': '__consumer_offsets'},
  {'partition': 0, 'topic': 'doctorkafka.operator_report'},
  {'partition': 25, 'topic': '__consumer_offsets'},
  {'partition': 8, 'topic': '__consumer_offsets'},
  {'partition': 35, 'topic': '__consumer_offsets'},
  {'partition': 4, 'topic': '__consumer_offsets'},
  {'partition': 2, 'topic': '__consumer_offsets'}],
 'freeDiskSpaceInBytes': 4291677859840,
 'hasFailure': False,
 'id': 10286,
 'inReassignmentReplicas': [],
 'instanceType': 'm5.large',
 'kafkaVersion': '1.1.1',
 'leaderReplicaStats': [{'bytesIn15MinMeanRate': 78,
   'bytesIn1MinMeanRate': 79,
   'bytesIn5MinMeanRate': 78,
   'bytesOut15MinMeanRate': 888,
   'bytesOut1MinMeanRate': 1517,
   'bytesOut5MinMeanRate': 1863,
   'cpuUsage': 1.4,
   'endOffset': 3778553,
   'inReassignment': False,
   'isLeader': True,
   'logSizeInBytes': 280141429,
   'numLogSegments': 1,
   'partition': 1,
   'startOffset': 3701289,
   'timestamp': 1547162979624,
   'topic': 'doctorkafka.brokerstats',
   'underReplicated': False}],
 'leaderReplicas': [{'partition': 1, 'topic': 'doctorkafka.brokerstats'}],
 'leadersBytesIn15MinRate': 78,
 'leadersBytesIn1MinRate': 79,
 'leadersBytesIn5MinRate': 78,
 'leadersBytesOut15MinRate': 888,
 'leadersBytesOut1MinRate': 1517,
 'leadersBytesOut5MinRate': 1863,
 'logFilesPath': '/mnt/kafka/data',
 'name': 'ip-10-10-2-86',
 'numLeaders': 1,
 'numReplicas': 26,
 'rackId': None,
 'statsVersion': '0.1.15',
 'sysBytesIn1MinRate': 0,
 'sysBytesOut1MinRate': 0,
 'timestamp': 1547162978968,
 'topicsBytesIn15MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 78},
 'topicsBytesIn1MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 79},
 'topicsBytesIn5MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 78},
 'topicsBytesOut15MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 888},
 'topicsBytesOut1MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 1517},
 'topicsBytesOut5MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 1863},
 'totalDiskSpaceInBytes': 4292333535232,
 'zkUrl': '10.10.16.238:2181,10.10.2.10:2181,10.10.6.32:2181'}

while the new kafkastats sends bad data:

{'amiId': 'ami-xxxxxxx',
 'availabilityZone': 'us-east-1c',
 'cpuUsage': 4.0,
 'failureReason': None,
 'followerReplicas': [{'partition': 13, 'topic': '__consumer_offsets'},
  {'partition': 23, 'topic': '__consumer_offsets'},
  {'partition': 19, 'topic': '__consumer_offsets'},
  {'partition': 17, 'topic': '__consumer_offsets'},
  {'partition': 32, 'topic': '__consumer_offsets'},
  {'partition': 26, 'topic': '__consumer_offsets'},
  {'partition': 7, 'topic': '__consumer_offsets'},
  {'partition': 40, 'topic': '__consumer_offsets'},
  {'partition': 5, 'topic': '__consumer_offsets'},
  {'partition': 3, 'topic': '__consumer_offsets'},
  {'partition': 34, 'topic': '__consumer_offsets'},
  {'partition': 47, 'topic': '__consumer_offsets'},
  {'partition': 16, 'topic': '__consumer_offsets'},
  {'partition': 14, 'topic': '__consumer_offsets'},
  {'partition': 41, 'topic': '__consumer_offsets'},
  {'partition': 10, 'topic': '__consumer_offsets'},
  {'partition': 49, 'topic': '__consumer_offsets'},
  {'partition': 31, 'topic': '__consumer_offsets'},
  {'partition': 29, 'topic': '__consumer_offsets'},
  {'partition': 0, 'topic': 'doctorkafka.operator_report'},
  {'partition': 25, 'topic': '__consumer_offsets'},
  {'partition': 8, 'topic': '__consumer_offsets'},
  {'partition': 35, 'topic': '__consumer_offsets'},
  {'partition': 4, 'topic': '__consumer_offsets'},
  {'partition': 2, 'topic': '__consumer_offsets'}],
 'freeDiskSpaceInBytes': 4291677859840,
 'hasFailure': True,
 'id': 10286,
 'inReassignmentReplicas': [],
 'instanceType': 'm5.large',
 'kafkaVersion': '1.1.1',
 'leaderReplicaStats': [{'bytesIn15MinMeanRate': 78,
   'bytesIn1MinMeanRate': 81,
   'bytesIn5MinMeanRate': 79,
   'bytesOut15MinMeanRate': 1013,
   'bytesOut1MinMeanRate': 13533,
   'bytesOut5MinMeanRate': 2859,
   'cpuUsage': 4.0,
   'endOffset': 3778503,
   'inReassignment': False,
   'isLeader': True,
   'logSizeInBytes': 280130728,
   'numLogSegments': 1,
   'partition': 1,
   'startOffset': 3701289,
   'timestamp': 1547162844124,
   'topic': 'doctorkafka.brokerstats',
   'underReplicated': False}],
 'leaderReplicas': [{'partition': 1, 'topic': 'doctorkafka.brokerstats'}],
 'leadersBytesIn15MinRate': 78,
 'leadersBytesIn1MinRate': 81,
 'leadersBytesIn5MinRate': 79,
 'leadersBytesOut15MinRate': 1013,
 'leadersBytesOut1MinRate': 13533,
 'leadersBytesOut5MinRate': 2859,
 'logFilesPath': '/mnt/kafka/data',
 'name': 'ip-10-10-2-86',
 'numLeaders': 1,
 'numReplicas': 26,
 'rackId': None,
 'statsVersion': '0.1.15',
 'sysBytesIn1MinRate': 0,
 'sysBytesOut1MinRate': 0,
 'timestamp': 1547162843992,
 'topicsBytesIn15MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 78},
 'topicsBytesIn1MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 81},
 'topicsBytesIn5MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 79},
 'topicsBytesOut15MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 1013},
 'topicsBytesOut1MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 13533},
 'topicsBytesOut5MinRate': {'__consumer_offsets': 0,
  'doctorkafka.brokerstats': 2859},
 'totalDiskSpaceInBytes': 4292333535232,
 'zkUrl': '10.10.16.238:2181,10.10.2.10:2181,10.10.6.32:2181'}
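
For what it's worth, here is a minimal sketch of the behavior I would expect from the collector, assuming it reads the per-topic BrokerTopicMetrics beans over JMX (the class, method, and guard below are illustrative, not the actual BrokerStatsRetriever code). Kafka appears to register a topic's BytesOutPerSec MBean lazily, once the topic has seen traffic, so a dead or idle topic could be reported as zero instead of flipping hasFailure to true:

import javax.management.InstanceNotFoundException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public final class TopicRateSketch {
  // Returns the topic's one-minute BytesOutPerSec rate, or 0.0 when the
  // MBean is absent (dead/idle topic) instead of treating it as a failure.
  static double bytesOutOneMinuteRate(MBeanServerConnection conn, String topic)
      throws Exception {
    ObjectName bean = new ObjectName(
        "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=" + topic);
    try {
      return (Double) conn.getAttribute(bean, "OneMinuteRate");
    } catch (InstanceNotFoundException e) {
      return 0.0;  // MBean not registered yet; report zero traffic
    }
  }
}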

Error in starting DoctorKafkaMain 0.2.4.3

Has anyone faced this issue?

I am using kafka_2.11-1.1.0 and doctorkafka 0.2.4.3 to run doctorkafka.

Running the process:
java -server
-cp libs/*:doctorkafka-0.2.4.3-jar-with-dependencies.jar
com.pinterest.doctorkafka.DoctorKafkaMain
server /Users/venkat/doctorkafka/drkafka/config/doctorkafka.dev.yaml

Console Log:
ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
INFO [2019-01-30 10:22:07,989] io.dropwizard.server.DefaultServerFactory: Registering jersey handler with root path prefix: /
INFO [2019-01-30 10:22:07,991] io.dropwizard.server.DefaultServerFactory: Registering admin handler with root path prefix: /
INFO [2019-01-30 10:22:07,991] io.dropwizard.assets.AssetsBundle: Registering AssetBundle with name: assets for path /*
INFO [2019-01-30 10:22:08,113] com.pinterest.doctorkafka.util.ZookeeperClient: Initialize curator with zkurl:localhost:2181
INFO [2019-01-30 10:22:08,146] org.apache.curator.utils.Compatibility: Running in ZooKeeper 3.4.x compatibility mode
INFO [2019-01-30 10:22:08,162] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:host.name=10.50.1.202
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.8.0_191
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/jre
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=libs/kafkastats-0.2.4.3-jar-with-dependencies.jar:libs/doctorkafka-0.2.4.3-jar-with-dependencies.jar:libs/doctorkafka-0.2.4.3.jar:libs/kafkastats-0.2.4.3.jar:doctorkafka-0.2.4.3-jar-with-dependencies.jar
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/Users/venkat/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/var/folders/28/qrgydfnx1jv4cl703nzp3lhc92z1zs/T/
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:os.name=Mac OS X
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:os.arch=x86_64
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:os.version=10.13.6
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:user.name=venkat
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:user.home=/Users/venkat
INFO [2019-01-30 10:22:08,226] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/Users/venkat/softwares/doctorkafka
INFO [2019-01-30 10:22:08,227] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@1352434e
INFO [2019-01-30 10:22:08,244] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-01-30 10:22:08,245] org.apache.curator.framework.imps.CuratorFrameworkImpl: Default schema
INFO [2019-01-30 10:22:08,272] org.apache.zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
INFO [2019-01-30 10:22:08,281] org.apache.zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1689a86df6e004d, negotiated timeout = 60000
INFO [2019-01-30 10:22:08,288] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
INFO [2019-01-30 10:22:08,515] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181/cluster1 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@69d787b5
INFO [2019-01-30 10:22:08,515] org.I0Itec.zkclient.ZkEventThread: Starting ZkClient event thread.
INFO [2019-01-30 10:22:08,516] org.I0Itec.zkclient.ZkClient: Waiting for keeper state SyncConnected
INFO [2019-01-30 10:22:08,517] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-01-30 10:22:08,517] org.apache.zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
INFO [2019-01-30 10:22:08,519] org.apache.zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1689a86df6e004e, negotiated timeout = 30000
INFO [2019-01-30 10:22:08,519] org.I0Itec.zkclient.ZkClient: zookeeper state changed (SyncConnected)
INFO [2019-01-30 10:22:08,530] kafka.utils.Log4jControllerRegistration$: Registered kafka:type=kafka.Log4jController MBean
15:52:08.598 [pool-6-thread-1] ERROR com.pinterest.doctorkafka.DoctorKafkaMain - DoctorKafka start failed
java.lang.NullPointerException: null
at java.util.Hashtable.put(Hashtable.java:460) ~[?:1.8.0_191]
at com.pinterest.doctorkafka.util.KafkaUtils.getKafkaConsumer(KafkaUtils.java:101) ~[kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.replicastats.ReplicaStatsManager.readPastReplicaStats(ReplicaStatsManager.java:208) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.DoctorKafka.start(DoctorKafka.java:51) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.DoctorKafkaMain.lambda$run$0(DoctorKafkaMain.java:68) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
INFO [2019-01-30 10:22:08,650] com.twitter.ostrich.stats.LatchedStatsListener$$anon$1: Starting LatchedStatsListener
INFO [2019-01-30 10:22:08,831] com.twitter.ostrich.admin.AdminHttpService: Admin HTTP interface started on port 2052.
INFO [2019-01-30 10:22:08,888] io.dropwizard.server.ServerFactory: Starting DoctorKafkaMain
INFO [2019-01-30 10:22:08,937] org.eclipse.jetty.setuid.SetUIDListener: Opened application@2bfaba70{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO [2019-01-30 10:22:08,938] org.eclipse.jetty.server.Server: jetty-9.4.z-SNAPSHOT; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 1.8.0_191-b12
INFO [2019-01-30 10:22:09,309] io.dropwizard.jersey.DropwizardResourceConfig: The following paths were found for the configured resources:

GET     /api/cluster (com.pinterest.doctorkafka.api.ClusterApi)
DELETE  /api/cluster/{clusterName}/admin/maintenance (com.pinterest.doctorkafka.api.MaintenanceApi)
GET     /api/cluster/{clusterName}/admin/maintenance (com.pinterest.doctorkafka.api.MaintenanceApi)
PUT     /api/cluster/{clusterName}/admin/maintenance (com.pinterest.doctorkafka.api.MaintenanceApi)
GET     /api/cluster/{clusterName}/broker (com.pinterest.doctorkafka.api.BrokerApi)

INFO [2019-01-30 10:22:09,311] org.eclipse.jetty.server.handler.ContextHandler: Started i.d.j.MutableServletContextHandler@2faa55bb{/,null,AVAILABLE}
INFO [2019-01-30 10:22:09,320] org.eclipse.jetty.server.AbstractConnector: Started application@2bfaba70{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO [2019-01-30 10:22:09,320] org.eclipse.jetty.server.Server: Started @2696ms
WARN [2019-01-30 10:22:18,923] com.pinterest.doctorkafka.util.MetricsPusher: Failed to send stats to OpenTSDB, will retry up to next interval
! java.net.ConnectException: Connection refused (Connection refused)
! at java.net.PlainSocketImpl.socketConnect(Native Method)
! at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
! at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
! at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
! at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
! at java.net.Socket.connect(Socket.java:589)
! at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
! ... 2 common frames omitted
! Causing: com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
! at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
! at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
! at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
^C15:52:20.336 [Thread-2] ERROR com.pinterest.doctorkafka.DoctorKafkaMain - Failure in stopping operator
java.lang.NullPointerException: null
at com.pinterest.doctorkafka.DoctorKafka.stop(DoctorKafka.java:78) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.DoctorKafkaMain$OperatorCleanupThread.run(DoctorKafkaMain.java:153) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
INFO [2019-01-30 10:22:20,342] org.eclipse.jetty.server.AbstractConnector: Stopped application@2bfaba70{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO [2019-01-30 10:22:20,345] org.eclipse.jetty.server.handler.ContextHandler: Stopped i.d.j.MutableServletContextHandler@2faa55bb{/,null,UNAVAILABLE}
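
A note on the NullPointerException at Hashtable.put: java.util.Properties extends Hashtable, which rejects null keys and values, so the crash pattern fits a null config value (for example, a broker list resolved from a zookeeper url) being put into the consumer properties inside getKafkaConsumer. A minimal illustration, with a hypothetical accessor and property name rather than the actual KafkaUtils code:

import java.util.Properties;

final class ConsumerPropsSketch {
  // Hypothetical, not the actual KafkaUtils code: Properties extends
  // Hashtable, which throws NullPointerException on null keys or values.
  static Properties build(String bootstrapBrokers) {
    if (bootstrapBrokers == null) {
      // Without this guard, props.put(...) throws the NPE seen above.
      throw new IllegalArgumentException("broker list could not be resolved; "
          + "check the zkurl settings in the DoctorKafka config");
    }
    Properties props = new Properties();
    props.put("bootstrap.servers", bootstrapBrokers);
    return props;
  }
}

In other words, it is worth double-checking that every [required] key in the config file referenced by doctorkafka.dev.yaml actually resolves to a value.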

0.2.4.1 isn't up on mavencentral

Dependencies were updated to 0.2.4.1, but nothing new has been pushed to Maven Central. Is there an ETA on when that build will be available?

Not able to start kafkastats

We're getting this error when starting kafkastats:

Unrecognized option: -jmxport=9999
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
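
The "Unrecognized option" message comes from the JVM launcher itself, which suggests -jmxport=9999 ended up before the main class and was parsed as a JVM flag. For comparison, the KafkaStatsMain invocation shown later on this page passes it as a program argument, after the main class and with a space-separated value (the jar name follows the version in the stack traces above; the broker host and trailing flags here are illustrative):

java -cp kafkastats-0.2.4.2-jar-with-dependencies.jar \
  com.pinterest.doctorkafka.stats.KafkaStatsMain \
  -broker broker-host -jmxport 9999 -topic brokerstats ...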

labels and metrics missing in stats.txt

curl localhost:2051/stats.txt

counters:
jvm_gc_G1_Old_Generation_cycles: 0
jvm_gc_G1_Old_Generation_msec: 0
jvm_gc_G1_Young_Generation_cycles: 4
jvm_gc_G1_Young_Generation_msec: 22
jvm_gc_cycles: 4
jvm_gc_msec: 22
kafka.stats.collector.success host=ATH021430: 3
gauges:
jvm_buffer_direct_count: 7
jvm_buffer_direct_max: 11238
jvm_buffer_direct_used: 11238
jvm_buffer_mapped_count: 0
jvm_buffer_mapped_max: 0
jvm_buffer_mapped_used: 0
jvm_classes_current_loaded: 5646
jvm_classes_total_loaded: 5646
jvm_classes_total_unloaded: 0
jvm_compilation_time_msec: 4092
jvm_current_mem_CodeHeap__non_nmethods__max: 5836800
jvm_current_mem_CodeHeap__non_nmethods__used: 1295104
jvm_current_mem_CodeHeap__non_profiled_nmethods__max: 122912768
jvm_current_mem_CodeHeap__non_profiled_nmethods__used: 1245952
jvm_current_mem_CodeHeap__profiled_nmethods__max: 122908672
jvm_current_mem_CodeHeap__profiled_nmethods__used: 5857920
jvm_current_mem_Compressed_Class_Space_max: 1073741824
jvm_current_mem_Compressed_Class_Space_used: 4882000
jvm_current_mem_G1_Eden_Space_max: -1
jvm_current_mem_G1_Eden_Space_used: 45088768
jvm_current_mem_G1_Old_Gen_max: 4294967296
jvm_current_mem_G1_Old_Gen_used: 5705664
jvm_current_mem_G1_Survivor_Space_max: -1
jvm_current_mem_G1_Survivor_Space_used: 4194304
jvm_current_mem_Metaspace_max: -1
jvm_current_mem_Metaspace_used: 38412688
jvm_current_mem_used: 106682400
jvm_fd_count: 28
jvm_fd_limit: 10240
jvm_heap_committed: 268435456
jvm_heap_max: 4294967296
jvm_heap_used: 53940160
jvm_nonheap_committed: 56143872
jvm_nonheap_max: -1
jvm_nonheap_used: 51693664
jvm_num_cpus: 8
jvm_post_gc_G1_Eden_Space_max: -1
jvm_post_gc_G1_Eden_Space_used: 0
jvm_post_gc_G1_Old_Gen_max: 4294967296
jvm_post_gc_G1_Old_Gen_used: 0
jvm_post_gc_G1_Survivor_Space_max: -1
jvm_post_gc_G1_Survivor_Space_used: 4194304
jvm_post_gc_used: 4194304
jvm_start_time: 1547650392880
jvm_thread_count: 63
jvm_thread_daemon_count: 18
jvm_thread_peak_count: 63
jvm_uptime: 165103
labels:
metrics:

There is an exception reported by the kafkastats process.

DoctorKafka fails to start: NullPointerException at getProcessingStartOffsets

Hi

I previously got DoctorKafka running with one broker on my local machine. I then added two more brokers on the same IP with different ports (a total Kafka cluster of 1 ZooKeeper node and 3 brokers). DoctorKafka now fails to start with the error below.

12:44:49.344 [pool-8-thread-1] ERROR com.pinterest.doctorkafka.DoctorKafkaMain - DoctorKafka start failed
java.lang.NullPointerException: null
        at com.pinterest.doctorkafka.util.ReplicaStatsUtil.getProcessingStartOffsets(ReplicaStatsUtil.java:28) ~[doctorkafka-0.2.4.9.jar:?]
        at com.pinterest.doctorkafka.replicastats.ReplicaStatsManager.readPastReplicaStats(ReplicaStatsManager.java:82) ~[doctorkafka-0.2.4.9.jar:?]
        at com.pinterest.doctorkafka.DoctorKafka.start(DoctorKafka.java:56) ~[doctorkafka-0.2.4.9.jar:?]
        at com.pinterest.doctorkafka.DoctorKafkaMain.lambda$run$0(DoctorKafkaMain.java:78) ~[doctorkafka-0.2.4.9.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_252]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
INFO  [2020-05-29 19:44:49,356] org.eclipse.jetty.setuid.SetUIDListener: Opened application@359066bc{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO  [2020-05-29 19:44:49,359] org.eclipse.jetty.server.Server: jetty-9.4.18.v20190429; built: 2019-04-29T20:42:08.989Z; git: e1bc35120a6617ee3df052294e433f3a25ce7097; jvm 1.8.0_252-8u252-b09-1~18.04-b09
INFO  [2020-05-29 19:44:49,812] io.dropwizard.jersey.DropwizardResourceConfig: The following paths were found for the configured resources:
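
A defensive sketch of the likely failure mode, assuming getProcessingStartOffsets iterates the partition list returned by KafkaConsumer.partitionsFor() for the brokerstats topic, which can come back null when the topic does not exist on the newly expanded cluster (the helper below is illustrative, not DoctorKafka code):

import java.util.List;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;

final class PartitionsSketch {
  static List<PartitionInfo> partitionsOrFail(KafkaConsumer<byte[], byte[]> consumer,
                                              String topic) {
    List<PartitionInfo> partitions = consumer.partitionsFor(topic);
    if (partitions == null || partitions.isEmpty()) {
      // partitionsFor() can return null/empty for a topic that has never
      // been created; failing with a message beats an NPE at line 28.
      throw new IllegalStateException("No partitions found for topic " + topic
          + "; has kafkastats published to it on this cluster?");
    }
    return partitions;
  }
}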

Kafkastats fails to report stats due to ArithmeticException

I'm unable to get the kafkastats service to report metrics. On each report interval I see the following error:

21:25:49.144 [StatsReporter] ERROR com.pinterest.doctorkafka.stats.BrokerStatsReporter - Failed to report stats
java.lang.ArithmeticException: / by zero
        at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.computeNetworkStats(BrokerStatsRetriever.java:676) ~[kafkastats-0.2.4.2.jar:?]
        at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.retrieveBrokerStats(BrokerStatsRetriever.java:656) ~[kafkastats-0.2.4.2.jar:?]
        at com.pinterest.doctorkafka.stats.BrokerStatsReporter.run(BrokerStatsReporter.java:57) [kafkastats-0.2.4.2.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_171]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_171]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_171]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]

This appears to be due to the value of deltaT always being 0 in this method: https://github.com/pinterest/doctorkafka/blob/master/kafkastats/src/main/java/com/pinterest/doctorkafka/stats/BrokerStatsRetriever.java#L668

I've tried running kafkastats with both the 0.2.4.3 and 0.2.4.2 releases and found that the problem is present in both.
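
If that diagnosis is right, the fix would be a guard before the division. A minimal sketch with illustrative names (not the actual computeNetworkStats code), assuming deltaT is the elapsed time between two network counter samples:

final class NetworkRateSketch {
  static long bytesPerSecond(long bytesNow, long bytesBefore,
                             long timeNowMs, long timeBeforeMs) {
    long deltaT = timeNowMs - timeBeforeMs;
    if (deltaT <= 0) {
      return 0L;  // same-timestamp samples: report zero instead of / by zero
    }
    return (bytesNow - bytesBefore) * 1000L / deltaT;
  }
}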

DrKafka start failed, NullPointerException at getProcessingStartOffsets

java -server -cp lib/*:doctorkafka-0.2.4.3.jar com.pinterest.doctorkafka.DoctorKafkaMain server dropwizard.yml
ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
INFO [2019-01-17 21:15:44,236] io.dropwizard.server.DefaultServerFactory: Registering jersey handler with root path prefix: /
INFO [2019-01-17 21:15:44,239] io.dropwizard.server.DefaultServerFactory: Registering admin handler with root path prefix: /
INFO [2019-01-17 21:15:44,239] io.dropwizard.assets.AssetsBundle: Registering AssetBundle with name: assets for path /*

INFO [2019-01-17 21:15:44,416] com.pinterest.doctorkafka.util.ZookeeperClient: Initialize curator with zkurl:df_zookeeper_dfops_743_1_development_datafabric_kafka_aws_us_ea.aws.athenahealth.com:2181/
INFO [2019-01-17 21:15:44,487] org.apache.curator.utils.Compatibility: Running in ZooKeeper 3.4.x compatibility mode
INFO [2019-01-17 21:15:44,513] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:host.name=ip-10-129-102-236.us-east-1.aws.athenahealth.com
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.8.0_191
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/jre
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=lib/kafkastats-0.2.4.3.jar:lib/commons-cli-1.3.1.jar:lib/avro-1.8.2.jar:lib/jackson-core-asl-1.9.13.jar:lib/jackson-mapper-asl-1.9.13.jar:lib/paranamer-2.7.jar:lib/snappy-java-1.1.1.3.jar:lib/commons-compress-1.8.1.jar:lib/xz-1.5.jar:lib/slf4j-api-1.7.25.jar:lib/kafka_2.12-1.1.0.jar:lib/kafka-clients-1.1.0.jar:lib/lz4-java-1.4.jar:lib/jackson-databind-2.9.8.jar:lib/jackson-annotations-2.9.8.jar:lib/jackson-core-2.9.8.jar:lib/jopt-simple-5.0.4.jar:lib/metrics-core-2.2.0.jar:lib/scala-library-2.12.7.jar:lib/scala-reflect-2.12.4.jar:lib/scala-logging_2.12-3.7.2.jar:lib/zkclient-0.10.jar:lib/zookeeper-3.4.10.jar:lib/log4j-1.2.16.jar:lib/jline-0.9.94.jar:lib/junit-3.8.1.jar:lib/netty-3.10.5.Final.jar:lib/log4j-core-2.11.1.jar:lib/log4j-api-2.11.1.jar:lib/ostrich_2.12-9.27.0.jar:lib/util-core_2.12-6.43.0.jar:lib/util-function_2.12-6.43.0.jar:lib/scala-parser-combinators_2.12-1.0.4.jar:lib/util-eval_2.12-6.43.0.jar:lib/scala-compiler-2.12.1.jar:lib/scala-xml_2.12-1.0.6.jar:lib/util-logging_2.12-6.43.0.jar:lib/util-app_2.12-6.43.0.jar:lib/util-registry_2.12-6.43.0.jar:lib/util-stats_2.12-6.43.0.jar:lib/util-lint_2.12-6.43.0.jar:lib/caffeine-2.3.4.jar:lib/jsr305-1.3.9.jar:lib/util-jvm_2.12-6.43.0.jar:lib/jackson-module-scala_2.12-2.9.7.jar:lib/jackson-module-paranamer-2.9.7.jar:lib/guava-23.0.jar:lib/error_prone_annotations-2.0.18.jar:lib/j2objc-annotations-1.1.jar:lib/animal-sniffer-annotations-1.14.jar:lib/commons-lang3-3.6.jar:lib/commons-configuration2-2.4.jar:lib/commons-text-1.6.jar:lib/commons-logging-1.2.jar:lib/commons-math3-3.6.1.jar:lib/commons-beanutils-1.9.3.jar:lib/commons-collections-3.2.2.jar:lib/commons-validator-1.6.jar:lib/commons-digester-1.8.1.jar:lib/curator-framework-4.0.1.jar:lib/curator-client-4.0.1.jar:lib/metrics-core-3.2.3.jar:lib/gson-2.8.2.jar:lib/javax.annotation-api-1.3.2.jar:lib/jaxws-api-2.3.1.jar:lib/jaxb-api-2.3.1.jar:lib/javax.activation-api-1.2.0.jar:lib/javax.xml.soap-api-1.4.0.jar:lib/dropwizard-assets-1.3.8.jar:lib/dropwizard-core-1.3.8.jar:lib/dropwizard-util-1.3.8.jar:lib/joda-time-2.9.9.jar:lib/dropwizard-jackson-1.3.8.jar:lib/jackson-datatype-guava-2.9.6.jar:lib/jackson-datatype-jsr310-2.9.6.jar:lib/jackson-datatype-jdk8-2.9.6.jar:lib/jackson-module-parameter-names-2.9.6.jar:lib/jackson-module-afterburner-2.9.6.jar:lib/jackson-datatype-joda-2.9.6.jar:lib/dropwizard-validation-1.3.8.jar:lib/hibernate-validator-5.4.2.Final.jar:lib/validation-api-1.1.0.Final.jar:lib/jboss-logging-3.3.0.Final.jar:lib/classmate-1.3.1.jar:lib/javax.el-3.0.0.jar:lib/javassist-3.22.0-GA.jar:lib/dropwizard-configuration-1.3.8.jar:lib/jackson-dataformat-yaml-2.9.6.jar:lib/snakeyaml-1.18.jar:lib/dropwizard-logging-1.3.8.jar:lib/metrics-logback-4.0.2.jar:lib/jul-to-slf4j-1.7.25.jar:lib/logback-core-1.2.3.jar:lib/logback-classic-1.2.3.jar:lib/jcl-over-slf4j-1.7.25.jar:lib/jetty-util-9.4.14.v20181114.jar:lib/dropwizard-metrics-1.3.8.jar:lib/dropwizard-lifecycle-1.3.8.jar:lib/javax.servlet-api-3.1.0.jar:lib/jetty-server-9.4.14.v20181114.jar:lib/jetty-http-9.4.14.v20181114.jar:lib/jetty-io-9.4.14.v20181114.jar:lib/dropwizard-jersey-1.3.8.jar:lib/jersey-server-2.25.1.jar:lib/jersey-common-2.25.1.jar:lib/javax.ws.rs-api-2.0.1.jar:lib/jersey-guava-2.25.1.jar:lib/hk2-api-2.5.0-b32.jar:lib/hk2-utils-2.5.0-b32.jar:lib/aopalliance-repackaged-2.5.0-b32.jar:lib/javax.inject-2.5.0-b32.jar:lib/hk2-locator-2.5.0-b32.jar:lib/osgi-resource-locator-1.0.1.jar:lib/jersey-client-2.25.1.jar:lib/jersey-media-jaxb-2.25.1.jar:lib/jersey-metainf-services-2.25.1.jar:lib/jersey-bean-validation-2.25.1.jar:lib/metrics-jersey2-4.0.2.jar:lib/metrics-annotation-4.0.2.jar:lib/jackson-jaxrs-json-provider-2.9.6.jar:lib/jackson-jaxrs-base-2.9.6.jar:lib/jackson-module-jaxb-annotations-2.9.6.jar:lib/jersey-container-servlet-2.25.1.jar:lib/jersey-container-servlet-core-2.25.1.jar:lib/jetty-webapp-9.4.14.v20181114.jar:lib/jetty-xml-9.4.14.v20181114.jar:lib/jetty-servlet-9.4.14.v20181114.jar:lib/jetty-security-9.4.14.v20181114.jar:lib/jetty-continuation-9.4.14.v20181114.jar:lib/dropwizard-servlets-1.3.8.jar:lib/dropwizard-jetty-1.3.8.jar:lib/metrics-jetty9-4.0.2.jar:lib/jetty-servlets-9.4.14.v20181114.jar:lib/metrics-jvm-4.0.2.jar:lib/metrics-jmx-4.0.2.jar:lib/metrics-servlets-4.0.2.jar:lib/metrics-healthchecks-4.0.2.jar:lib/metrics-json-4.0.2.jar:lib/profiler-1.0.2.jar:lib/dropwizard-request-logging-1.3.8.jar:lib/logback-access-1.2.3.jar:lib/argparse4j-0.8.1.jar:lib/jetty-setuid-java-1.0.3.jar:lib/dropwizard-auth-1.3.8.jar:doctorkafka-0.2.4.3.jar
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.10.0-862.3.2.el7.x86_64
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:user.name=yuzhao
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/yuzhao
INFO [2019-01-17 21:15:44,524] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/yuzhao
INFO [2019-01-17 21:15:44,525] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=df_zookeeper_dfops_743_1_development_datafabric_kafka_aws_us_ea.aws.athenahealth.com:2181/ sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@77681ce4
INFO [2019-01-17 21:15:44,619] org.apache.zookeeper.ClientCnxn: Opening socket connection to server 10.129.107.71/10.129.107.71:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-01-17 21:15:44,627] org.apache.curator.framework.imps.CuratorFrameworkImpl: Default schema
INFO [2019-01-17 21:15:44,636] org.apache.zookeeper.ClientCnxn: Socket connection established to 10.129.107.71/10.129.107.71:2181, initiating session
INFO [2019-01-17 21:15:44,648] io.dropwizard.server.ServerFactory: Starting DoctorKafkaMain
INFO [2019-01-17 21:15:44,650] org.apache.zookeeper.ClientCnxn: Session establishment complete on server 10.129.107.71/10.129.107.71:2181, sessionid = 0x1684d33f9000020, negotiated timeout = 40000
INFO [2019-01-17 21:15:44,669] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
INFO [2019-01-17 21:15:44,845] org.eclipse.jetty.setuid.SetUIDListener: Opened application@6f044c58{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO [2019-01-17 21:15:44,847] org.eclipse.jetty.server.Server: jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 1.8.0_191-b12
INFO [2019-01-17 21:15:45,021] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=df_zookeeper_dfops_743_1_development_datafabric_kafka_aws_us_ea.aws.athenahealth.com:2181/zookeeper_datafabric_development_dfops_743_1_kafka_aws_us_east_1_df_zookeeper sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@277a8c9
INFO [2019-01-17 21:15:45,022] org.I0Itec.zkclient.ZkEventThread: Starting ZkClient event thread.
INFO [2019-01-17 21:15:45,024] org.I0Itec.zkclient.ZkClient: Waiting for keeper state SyncConnected
INFO [2019-01-17 21:15:45,025] org.apache.zookeeper.ClientCnxn: Opening socket connection to server 10.129.107.71/10.129.107.71:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-01-17 21:15:45,027] org.apache.zookeeper.ClientCnxn: Socket connection established to 10.129.107.71/10.129.107.71:2181, initiating session
INFO [2019-01-17 21:15:45,033] org.apache.zookeeper.ClientCnxn: Session establishment complete on server 10.129.107.71/10.129.107.71:2181, sessionid = 0x1684d33f9000021, negotiated timeout = 30000
INFO [2019-01-17 21:15:45,040] org.I0Itec.zkclient.ZkClient: zookeeper state changed (SyncConnected)
INFO [2019-01-17 21:15:45,055] kafka.utils.Log4jControllerRegistration$: Registered kafka:type=kafka.Log4jController MBean
INFO [2019-01-17 21:15:45,419] org.apache.kafka.clients.consumer.ConsumerConfig: ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [10.129.105.28:9092, 10.129.104.201:9092, 10.129.103.217:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
enable.auto.commit = false
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = doctorkafka
heartbeat.interval.ms = 3000
interceptor.classes = []
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
max.partition.fetch.bytes = 4194304
max.poll.interval.ms = 1
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer

INFO [2019-01-17 21:15:45,486] org.apache.kafka.common.utils.AppInfoParser: Kafka version : 1.1.0
INFO [2019-01-17 21:15:45,486] org.apache.kafka.common.utils.AppInfoParser: Kafka commitId : fdcf75ea326b8e07
INFO [2019-01-17 21:15:45,625] org.apache.kafka.clients.Metadata: Cluster ID: M4N2fNprQYKCrKhKzAH4zQ
21:15:45.676 [pool-6-thread-1] ERROR com.pinterest.doctorkafka.DoctorKafkaMain - DoctorKafka start failed
java.lang.NullPointerException: null
at com.pinterest.doctorkafka.replicastats.ReplicaStatsManager.getProcessingStartOffsets(ReplicaStatsManager.java:193) ~[doctorkafka-0.2.4.3.jar:?]
at com.pinterest.doctorkafka.replicastats.ReplicaStatsManager.readPastReplicaStats(ReplicaStatsManager.java:216) ~[doctorkafka-0.2.4.3.jar:?]
at com.pinterest.doctorkafka.DoctorKafka.start(DoctorKafka.java:51) ~[doctorkafka-0.2.4.3.jar:?]
at com.pinterest.doctorkafka.DoctorKafkaMain.lambda$run$0(DoctorKafkaMain.java:68) ~[doctorkafka-0.2.4.3.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
INFO [2019-01-17 21:15:45,782] io.dropwizard.jersey.DropwizardResourceConfig: The following paths were found for the configured resources:

GET     /api/cluster (com.pinterest.doctorkafka.api.ClusterApi)
DELETE  /api/cluster/{clusterName}/admin/maintenance (com.pinterest.doctorkafka.api.MaintenanceApi)
GET     /api/cluster/{clusterName}/admin/maintenance (com.pinterest.doctorkafka.api.MaintenanceApi)
PUT     /api/cluster/{clusterName}/admin/maintenance (com.pinterest.doctorkafka.api.MaintenanceApi)
GET     /api/cluster/{clusterName}/broker (com.pinterest.doctorkafka.api.BrokerApi)

INFO [2019-01-17 21:15:45,784] org.eclipse.jetty.server.handler.ContextHandler: Started i.d.j.MutableServletContextHandler@1961d75a{/,null,AVAILABLE}
INFO [2019-01-17 21:15:45,797] org.eclipse.jetty.server.AbstractConnector: Started application@6f044c58{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO [2019-01-17 21:15:45,797] org.eclipse.jetty.server.Server: Started @3572ms

^C
21:17:16.003 [Thread-2] ERROR com.pinterest.doctorkafka.DoctorKafkaMain - Failure in stopping operator
java.lang.NullPointerException: null
at com.pinterest.doctorkafka.DoctorKafka.stop(DoctorKafka.java:78) ~[doctorkafka-0.2.4.3.jar:?]
at com.pinterest.doctorkafka.DoctorKafkaMain$OperatorCleanupThread.run(DoctorKafkaMain.java:153) [doctorkafka-0.2.4.3.jar:?]
INFO [2019-01-17 21:17:16,006] org.eclipse.jetty.server.AbstractConnector: Stopped application@6f044c58{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
INFO [2019-01-17 21:17:16,012] org.eclipse.jetty.server.handler.ContextHandler: Stopped i.d.j.MutableServletContextHandler@1961d75a{/,null,UNAVAILABLE}
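
Since both getProcessingStartOffsets reports on this page fail while reading past replica stats, a quick pre-flight check that the brokerstats topic actually exists on the configured cluster can narrow things down. A standalone sketch using the stock Kafka AdminClient (not DoctorKafka code; the bootstrap server and topic name are arguments):

import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;

public final class BrokerstatsTopicCheck {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put("bootstrap.servers", args[0]);           // e.g. 10.129.105.28:9092
    String topic = args.length > 1 ? args[1] : "brokerstats";
    try (AdminClient admin = AdminClient.create(props)) {
      Set<String> topics = admin.listTopics().names().get();
      System.out.println(topics.contains(topic)
          ? topic + " exists"
          : topic + " is missing; kafkastats may not be publishing yet");
    }
  }
}

Separately, max.poll.interval.ms = 1 in the ConsumerConfig dump above looks suspicious for a consumer that is expected to backtrack a day of stats.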

Cannot get dropwizard server running

java -server -cp lib/*:doctorkafka-0.2.4.jar com.pinterest.doctorkafka.DoctorKafkaMain server dropwizard.yml
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging.
16:12:46.302 [main] ERROR com.pinterest.doctorkafka.config.DoctorKafkaConfig - Failed to initialize configuration file null
java.lang.NullPointerException: null
at java.io.File.<init>(File.java:276) ~[?:?]
at com.pinterest.doctorkafka.config.DoctorKafkaConfig.<init>(DoctorKafkaConfig.java:53) [doctorkafka-0.2.4.jar:?]
at com.pinterest.doctorkafka.DoctorKafkaMain.main(DoctorKafkaMain.java:86) [doctorkafka-0.2.4.jar:?]
Exception in thread "main" java.lang.NullPointerException
at com.pinterest.doctorkafka.config.DoctorKafkaConfig.getClusterZkUrls(DoctorKafkaConfig.java:85)
at com.pinterest.doctorkafka.DoctorKafka.<init>(DoctorKafka.java:35)
at com.pinterest.doctorkafka.DoctorKafkaMain.main(DoctorKafkaMain.java:87)

If I specify the config file directly, the server starts, but the web server does not:
java -server -cp lib/*:doctorkafka-0.2.4.jar com.pinterest.doctorkafka.DoctorKafkaMain -config config/doctorkafka.properties
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'org.apache.logging.log4j.simplelog.StatusLogger.level' to TRACE to show Log4j2 internal initialization logging.
log4j:WARN No appenders could be found for logger (org.apache.commons.beanutils.converters.BooleanConverter).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

dropwizard.yml:

config: config/doctorkafka.properties
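
For reference, a minimal dropwizard.yml sketch with an explicit server block. The config: key is the DoctorKafka-specific pointer to the properties file; the server: section is standard Dropwizard, shown here on the assumption that DoctorKafkaMain is a stock Dropwizard application:

config: config/doctorkafka.properties
server:
  applicationConnectors:
    - type: http
      port: 8080
  adminConnectors:
    - type: http
      port: 8081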

doctorkafka.properties:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

###################################################################################
# DoctorKafka global settings, including the zookeeper url for kafkastats topic
###################################################################################

# [required] zookeeper quorum for storing doctorkafka metadata
doctorkafka.zkurl=localhost:2181

# [required] zookeeper connection string for kafkastats topic
doctorkafka.brokerstats.zkurl=localhost:2181

# [required] kafka topic name for kafkastats
doctorkafka.brokerstats.topic=brokerstats

# [required] the time window that doctorkafka uses to compute
doctorkafka.brokerstats.backtrack.seconds=86400

# [optional] ssl related setting for the kafka cluster that hosts brokerstats;
# can be PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
doctorkafka.brokerstats.consumer.security.protocol=SSL
doctorkafka.brokerstats.consumer.ssl.client.auth=required
doctorkafka.brokerstats.consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
doctorkafka.brokerstats.consumer.ssl.endpoint.identification.algorithm=HTTPS
doctorkafka.brokerstats.consumer.ssl.key.password=key_password
doctorkafka.brokerstats.consumer.ssl.keystore.location=keystore_path
doctorkafka.brokerstats.consumer.ssl.keystore.password=keystore_password
doctorkafka.brokerstats.consumer.ssl.keystore.type=JKS
doctorkafka.brokerstats.consumer.ssl.secure.random.implementation=SHA1PRNG
doctorkafka.brokerstats.consumer.ssl.truststore.location=truststore_path
doctorkafka.brokerstats.consumer.ssl.truststore.password=truststore_password
doctorkafka.brokerstats.consumer.ssl.truststore.type=JKS

# [required] zookeeper connection string for action_report topic
doctorkafka.action.report.zkurl=localhost:2181

# [required] kafka topics for storing the actions that doctorkafka takes.
doctorkafka.action.report.topic=operator_report

# [optional] broker replacement interval in seconds
doctorkakka.action.broker_relacement.interval.seconds=43200

# [optional] broker replacement script
doctorkafka.action.broker_replacement.command="/usr/local/bin/ec2-replace-node.py -r "

# [optional] ssl related settings for action report producer
doctorkafka.action.producer.security.protocol=SSL
doctorkafka.action.producer.ssl.client.auth=required
doctorkafka.action.producer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
doctorkafka.action.producer.ssl.endpoint.identification.algorithm=HTTPS
doctorkafka.action.producer.ssl.key.password=key_password
doctorkafka.action.producer.ssl.keystore.location=keystore_path
doctorkafka.action.producer.ssl.keystore.password=keystore_password
doctorkafka.action.producer.ssl.keystore.type=JKS
doctorkafka.action.producer.ssl.secure.random.implementation=SHA1PRNG
doctorkafka.action.producer.ssl.truststore.location=truststore_path
doctorkafka.action.producer.ssl.truststore.password=truststore_password
doctorkafka.action.producer.ssl.truststore.type=JKS

# [required] doctorkafka web port
doctorkafka.web.port=8080

# [required] doctorkafka service restart interval
doctorkafka.restart.interval.seconds=86400

# [optional] ostrich port
doctorkafka.ostrich.port=2052

# [optional] tsd host and port
doctorkafka.tsd.hostport=localhost:18621

# [required] email addresses for sending general notification on cluster under-replication etc.
doctorkafka.emails.notification=email_address_1,email_address_2

# [required] email addresses for sending alerts to
doctorkafka.emails.alert=email_address_3,email_address_4

# [optional] brokerstats.version
doctorkafka.brokerstats.version=0.1.15

################################################################################
# Settings for managed kafka clusters.
################################################################################

# cluster1 settings

# [required] whether DoctorKafka runs in the dry run mode.
kafkacluster.cluster1.dryrun=true

# [required] zookeeper url for the kafka cluster
kafkacluster.cluster1.zkurl=localhost:2181

# [required] the network-inbound limit in megabytes
kafkacluster.cluster1.network.inbound.limit.mb=35

# [required] the network-outbound limit in megabytes
kafkacluster.cluster1.network.outbound.limit.mb=80

# [required] the broker's maximum network bandwidth
kafkacluster.cluster1.network.bandwith.max.mb=150

# [optional] ssl related kafka consumer setting for accessing topic metadata info of cluster1
kafkacluster.cluster1.consumer.security.protocol=SSL
kafkacluster.cluster1.consumer.ssl.client.auth=required
kafkacluster.cluster1.consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
kafkacluster.cluster1.consumer.ssl.endpoint.identification.algorithm=HTTPS
kafkacluster.cluster1.consumer.ssl.key.password=key_password
kafkacluster.cluster1.consumer.ssl.keystore.location=keystore_path
kafkacluster.cluster1.consumer.ssl.keystore.password=keystore_password
kafkacluster.cluster1.consumer.ssl.keystore.type=JKS
kafkacluster.cluster1.consumer.ssl.secure.random.implementation=SHA1PRNG
kafkacluster.cluster1.consumer.ssl.truststore.location=truststore_path
kafkacluster.cluster1.consumer.ssl.truststore.password=truststore_password
kafkacluster.cluster1.consumer.ssl.truststore.type=JKS

# cluster2 settings
#kafkacluster.cluster2.dryrun=true
#kafkacluster.cluster2.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster2
#kafkacluster.cluster2.network.inbound.limit.mb=35
#kafkacluster.cluster2.network.outbound.limit.mb=80
#kafkacluster.cluster2.network.bandwith.max.mb=150
#kafkacluster.cluster2.check_interval_in_seconds=5
#kafkacluster.cluster2.deadbroker_replacement.enable=true
#kafkacluster.cluster2.deadbroker_replacement.no_stats.seconds=1200
#kafkacluster.cluster2.notification.email=[email protected]
#kafkacluster.cluster2.notificatino.pager=[email protected]

java.lang.NoClassDefFoundError: scala/reflect/internal/util/Statistics$

Hi

I have downloaded doctorkafka from https://github.com/pinterest/doctorkafka.git and the build completed successfully. However, while running the KafkaStatsMain command I get the error below. Please help.
Command:

java -server
-Dlog4j.configurationFile=file:./config/log4j2.xml
-cp target/lib/:target//:target/:target/kafkastats-0.2.4.4.jar:/kafka/tools/doctorkafka/kafkastats/target/classes/com/pinterest/doctorkafka/*:target/kafkastats-0.2.4.4-jar-with-dependencies.jar
com.pinterest.doctorkafka.stats.KafkaStatsMain
-broker kafka-poc-4
-jmxport 8888
-topic brokerstats
-zookeeper kafka-poc-4:2181/cluster
-uptimeinseconds 3600
-pollingintervalinseconds 60
-ostrichport 2051
-tsdhostport localhost:18126
-kafka_config /kafka/server.properties
-producer_config /kafka/producer.properties
-primary_network_ifacename eth0

ERROR:
[root@sidharth-kafka-poc-1 kafkastats]# sh dockafka.sh
2019-04-08 18:28:55,208 main DEBUG Apache Log4j Core 2.11.1 initializing configuration XmlConfiguration[location=/kafka/tools/doctorkafka/kafkastats/./config/log4j2.xml]
2019-04-08 18:28:55,414 main DEBUG Installed 2 script engines

Failed to initialize compiler: NoClassDefFoundError.
This is most often remedied by a full clean and recompile.
Otherwise, your classpath may continue bytecode compiled by
different and incompatible versions of scala.

java.lang.NoClassDefFoundError: scala/reflect/internal/util/Statistics$
at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:310)
at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:213)
at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1530)
at scala.reflect.internal.Symbols$Symbol.initialize(Symbols.scala:1678)
at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1422)
at scala.tools.nsc.Global$Run.<init>(Global.scala:1140)
at scala.tools.nsc.interpreter.IMain._initialize(IMain.scala:127)
at scala.tools.nsc.interpreter.IMain.initializeSynchronous(IMain.scala:149)
at scala.tools.nsc.interpreter.Scripted.<init>(Scripted.scala:74)
at scala.tools.nsc.interpreter.Scripted$.apply(Scripted.scala:309)
at scala.tools.nsc.interpreter.Scripted$Factory.getScriptEngine(Scripted.scala:302)
at org.apache.logging.log4j.core.script.ScriptManager.<init>(ScriptManager.java:99)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:216)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:250)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:547)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:619)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:636)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:231)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:153)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194)
at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:581)
at com.pinterest.doctorkafka.stats.KafkaStatsMain.<clinit>(KafkaStatsMain.java:34)
Caused by: java.lang.ClassNotFoundException: scala.reflect.internal.util.Statistics$
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 23 more

Failed to initialize compiler: NoClassDefFoundError.
This is most often remedied by a full clean and recompile.
Otherwise, your classpath may continue bytecode compiled by
different and incompatible versions of scala.

java.lang.NoClassDefFoundError: scala/reflect/internal/util/Statistics$
at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:310)
at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:213)
at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1530)
at scala.reflect.internal.Symbols$Symbol.initialize(Symbols.scala:1678)
at scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1422)
at scala.tools.nsc.Global$Run.<init>(Global.scala:1140)
at scala.tools.nsc.interpreter.IMain._initialize(IMain.scala:127)
at scala.tools.nsc.interpreter.IMain.global$lzycompute(IMain.scala:156)
at scala.tools.nsc.interpreter.IMain.global(IMain.scala:155)
at scala.tools.nsc.interpreter.IMain.initializeSynchronous(IMain.scala:150)
at scala.tools.nsc.interpreter.Scripted.<init>(Scripted.scala:74)
at scala.tools.nsc.interpreter.Scripted$.apply(Scripted.scala:309)
at scala.tools.nsc.interpreter.Scripted$Factory.getScriptEngine(Scripted.scala:302)
at org.apache.logging.log4j.core.script.ScriptManager.<init>(ScriptManager.java:99)
at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:216)
at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:250)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:547)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:619)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:636)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:231)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:153)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194)
at org.apache.logging.log4j.LogManager.getLogger(LogManager.java:581)
at com.pinterest.doctorkafka.stats.KafkaStatsMain.<clinit>(KafkaStatsMain.java:34)
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.NullPointerException
at java.util.Hashtable.put(Hashtable.java:460)
at com.pinterest.doctorkafka.util.OperatorUtil.createKafkaProducerProperties(OperatorUtil.java:234)
at com.pinterest.doctorkafka.stats.KafkaAvroPublisher.<init>(KafkaAvroPublisher.java:60)
at com.pinterest.doctorkafka.stats.KafkaStatsMain.main(KafkaStatsMain.java:135)
2019-04-08 18:28:58,428 pool-1-thread-1 DEBUG Stopping LoggerContext[name=31befd9f, org.apache.logging.log4j.core.LoggerContext@6a988392]
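
For what it's worth, the final NullPointerException at Hashtable.put is the signature of a required producer property resolving to null: java.util.Properties extends Hashtable, which rejects null keys and values. A minimal illustration of that failure mode, assuming, purely for illustration, that the value came from an unset -D system property:

    import java.util.Properties;

    public class ProducerPropsCheck {
        public static void main(String[] args) {
            // Returns null when -Dbootstrap.servers was not passed on the command line
            String bootstrapServers = System.getProperty("bootstrap.servers");
            Properties props = new Properties();
            // Properties extends Hashtable, so a null value throws NullPointerException
            // at Hashtable.put, the same frame as in the trace above
            props.put("bootstrap.servers", bootstrapServers);
        }
    }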

mvn package breaks on OpenJDK

Doctor Kafka generated invalid reassignment

We adopted DoctorKafka in our production system recently. Shortly after we started using it, a broker host developed some issues, resulting in many under-replicated partitions. doctorkafka tried to resolve the under-replicated partitions, but it somehow generated a reassignment that included the following entries (copied from doctorkafka's log):

{
    "topic": "mydomain.MyTopic1",
    "partition": 40,
    "replicas": [
        65633,
        65633,
        65633
    ]
},
{
    "topic": "mydomain.MyTopic2",
    "partition": 2,
    "replicas": [
        65633,
        65633,
        65633
    ]
},

...
This impacted our production system and delayed the downstream process. We have more than 50 broker hosts, and 1 host was having trouble. I assume that the above assignment was generated due to a bug; is that correct? Have you seen this issue before? I'd appreciate it if you could look into it ASAP. Thanks!
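
A reassignment entry that repeats the same broker id is never valid, so one mitigation while the bug is investigated is to validate plans before submitting them. A minimal pre-flight check (a sketch, not DoctorKafka's actual validation code):

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;

    public class ReassignmentSanityCheck {
        // A replica list is sane only if every entry is a distinct broker id
        static boolean hasDistinctReplicas(List<Integer> replicas) {
            return new HashSet<>(replicas).size() == replicas.size();
        }

        public static void main(String[] args) {
            // The degenerate entry from the log above: one broker repeated three times
            System.out.println(hasDistinctReplicas(Arrays.asList(65633, 65633, 65633))); // false
            System.out.println(hasDistinctReplicas(Arrays.asList(65633, 65634, 65635))); // true
        }
    }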

Timeout expired while fetching topic metadata

I'm clearly misunderstanding something, but I'm unable to get kafkastats to actually collect and report any stats.

My cluster is running kafka_2.12-0.11.0.2, with Java 1.8.0_131-b11.
I've built kafkastats at both 0.2.1 and 0.2.2 with identical results.

I launch kafkastats with this:

java -server -Xmx800M -Xms800M -verbosegc -Xloggc:./gc.log \
    -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=20M \
    -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:G1ReservePercent=10 -XX:ConcGCThreads=4 \
    -XX:ParallelGCThreads=4 -XX:G1HeapRegionSize=8m -XX:InitiatingHeapOccupancyPercent=70 \
    -XX:ErrorFile=./jvm_error.log \
    -Dlog4j.configurationFile=file:./log4j2.xml \
    -cp /mnt/kafkastats/kafkastats-0.2.2-jar-with-dependencies.jar \
    -Dbootstrap.servers=10.10.1.139:9092,10.10.1.195:9092,10.10.2.233:9092,10.10.2.148:9092 \
    com.pinterest.doctorkafka.stats.KafkaStatsMain \
        -broker 10.10.1.139 \
        -jmxport 9090 \
        -kafka_config /mnt/kafkastats/producer.properties \
        -ostrichport 2051 \
        -pollingintervalinseconds 5 \
        -topic doctorkafka.brokerstats \
        -tsdhostport localhost:18126 \
        -uptimeinseconds 99000 \
        -zookeeper 10.10.2.152:2181,10.10.2.237:2181,10.10.1.49:2181,10.10.2.104:2181,10.10.1.83:2181

My producer.properties file looks like this:

bootstrap.servers=10.10.1.139:9092,10.10.1.195:9092,10.10.2.233:9092,10.10.2.148:9092
client.id=kafkastats
compression.type=snappy
group.id=kafkastats-kafka-dev
linger.ms=5
log.dirs=/mnt/kafka/data

I've verified that all 4 Kafka brokers and all 5 Zookeeper IP addresses are correct.

I'll be attaching the log to this ticket in just a minute.
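
In the meantime, it can help to confirm the brokers are reachable from this host with the same bootstrap list and client libraries. A minimal sketch using the AdminClient that ships with the 0.11+ kafka-clients jar:

    import java.util.Properties;
    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class BrokerReachability {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "10.10.1.139:9092,10.10.1.195:9092,10.10.2.233:9092,10.10.2.148:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // Fails with a TimeoutException when cluster metadata cannot be fetched,
                // the same symptom as "Timeout expired while fetching topic metadata"
                System.out.println(admin.describeCluster().nodes().get(10, TimeUnit.SECONDS));
            }
        }
    }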

build for kafka 0.9.0.1 failed

When building for 0.9.0.1, I changed pom.xml as shown below and rebuilt:

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.10</artifactId>                                                                                                                                                                                     
            <version>0.9.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>0.9.0.1</version>
        </dependency>

Compilation failed with:
[ERROR] /root/git_workspace/doctorkafka/kafkastats/src/main/java/com/pinterest/doctorkafka/util/OperatorUtil.java:[18,38] error: cannot find symbol

The code it refers to is:

 18 import org.apache.kafka.common.network.ListenerName;                                                                                                                                                                            
 19 import org.apache.kafka.common.protocol.SecurityProtocol;

Is it not compatible with older Kafka versions? Any help would be appreciated.

Display Dr. Kafka action status in UI

The Dr. Kafka UI currently doesn't show whether the actions it takes have succeeded. It would be extremely useful for operations to know whether:

  1. Broker replacement commands succeeded
  2. Reassignments have succeeded

ec2metadata and build.properties are missing in the project source

These are referenced in the following classes (a defensive sketch for the ec2metadata call follows the list):
1)BrokerStatsRetriever -> Process process = Runtime.getRuntime().exec("ec2metadata");
2)OstrichAdminService -> properties.load(this.getClass().getResource("build.properties").openStream());
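
One way to make the collector tolerate hosts where the tool is absent would be to treat the exec failure as "not on EC2" rather than as an error. A defensive sketch (an assumption on my part that empty instance info is an acceptable fallback):

    import java.io.IOException;
    import java.util.Scanner;

    public class Ec2MetadataProbe {
        // Returns the raw ec2metadata output, or null when the tool is not installed
        static String tryEc2Metadata() {
            try {
                Process process = Runtime.getRuntime().exec("ec2metadata");
                try (Scanner scanner = new Scanner(process.getInputStream()).useDelimiter("\\A")) {
                    return scanner.hasNext() ? scanner.next() : "";
                }
            } catch (IOException e) {
                // "error=2, No such file or directory": not an EC2 host, skip instance info
                return null;
            }
        }

        public static void main(String[] args) {
            System.out.println(tryEc2Metadata() == null ? "not on EC2" : "on EC2");
        }
    }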

I'd also like to know if anyone has been able to resolve the following OpenTSDB-related issue:
com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB,

Release Cycle

Just want to say up front this project looks awesome! It totally caught my eye for a few reasons, as dealing with failed brokers & topic under-replication are, in my opinion, two of the biggest pain points of cluster management/maintenance (plus the UI is sweet). However, I noticed there haven't been any releases yet, and I would be a bit hesitant to put it on a production cluster given that. Are there any plans for an official release? I would really love to use doctorkafka and contribute back, thank you very much.

Question: Why does DoctorKafka rely on same cluster it is trying to heal? Isn't that an architecture problem?

Hi

I am curious about the architecture and design of DoctorKafka.

Usually a system (e.g. DoctorKafka) that checks on the health of another system should be isolated from the system being checked (the Kafka cluster). So I am curious about the design decision to use a Kafka topic within the cluster (the same cluster we are trying to auto-heal and health-check) to report the metrics consumed by DoctorKafka.

Can someone please clarify and help me understand why this is a good design decision?
What was the thought process on the fact DoctorKafka relies on a topic within the same cluster it is trying to heal?

Kubernetes

Hello, is doctorkafka going to be ported to deal with Kafka clusters deployed on Kubernetes?

doctorkafka doesn't auto-restart

My configuration has

doctorkafka.restart.interval.seconds=86400

However, my DoctorKafka servers have status lines like this:

Version: 0.2.4.3, Uptime: 437164.017 seconds

The kafkastats processes are certainly restarting as they are supposed to, but doctorkafka seems to ignore this setting.
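
For reference, the setting implies behavior along these lines: the process watches its own uptime and exits once the interval is exceeded, leaving an external supervisor to relaunch it. A sketch of that expectation (an assumption about the mechanism, not DoctorKafka's actual restart code):

    import java.lang.management.ManagementFactory;

    public class RestartGate {
        public static void main(String[] args) throws InterruptedException {
            long restartIntervalSeconds = 86400; // doctorkafka.restart.interval.seconds
            while (true) {
                long uptimeSeconds = ManagementFactory.getRuntimeMXBean().getUptime() / 1000;
                if (uptimeSeconds > restartIntervalSeconds) {
                    System.exit(0); // the supervisor (systemd, runit, ...) relaunches the process
                }
                Thread.sleep(60000L);
            }
        }
    }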

rack information is ignored when generating a partition reassignment plan.

I just discovered today that a bunch of my clusters are no longer resilient to AZ failure because many partitions DoctorKafka has moved exist in only a single rack (AZ). I see availability-zone is collected correctly in AWS, but rackId doesn't seem to be populated even though broker.rack is set in all my server.properties files.

Checking the code, I see no references to rackId in the assignment generator.
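
The invariant the assignment generator should enforce is that every partition's replica set spans at least two racks. A minimal check (a sketch with a hypothetical rackByBroker map, not the project's code):

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class RackSpreadCheck {
        // True when the replica set spans at least two racks (availability zones)
        static boolean spansRacks(List<Integer> replicas, Map<Integer, String> rackByBroker) {
            Set<String> racks = new HashSet<>();
            for (int brokerId : replicas) {
                racks.add(rackByBroker.get(brokerId));
            }
            return racks.size() >= 2;
        }

        public static void main(String[] args) {
            Map<Integer, String> rackByBroker = new HashMap<>();
            rackByBroker.put(1, "us-east-1a");
            rackByBroker.put(2, "us-east-1a");
            rackByBroker.put(3, "us-east-1b");
            System.out.println(spansRacks(Arrays.asList(1, 2), rackByBroker)); // false: one AZ
            System.out.println(spansRacks(Arrays.asList(1, 3), rackByBroker)); // true
        }
    }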

Errors in starting KafkaStatsMain 0.2.4.3

I am getting the following errors while running KafkaStats and would appreciate someone's guidance in fixing them.

Errors:
1. ERROR com.pinterest.doctorkafka.stats.BrokerStatsRetriever - Failed to get ec2 metadata
2. ERROR com.pinterest.doctorkafka.util.OperatorUtil - Failed to connect to MBeanServer bd-venkat:9999
3. WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)

Starting command:
java -server
-Dlog4j.configurationFile=file:./log4j2.xml
-cp libs/*:kafkastats-0.2.4.3-jar-with-dependencies.jar
com.pinterest.doctorkafka.stats.KafkaStatsMain
-broker localhost
-jmxport 9999
-topic brokerstats
-zookeeper localhost:2181
-uptimeinseconds 3600
-pollingintervalinseconds 60
-ostrichport 2051
-tsdhostport localhost:18126
-kafka_config /Users/venkat/kafka_2.11-1.1.0/config/server1.properties
-producer_config /Users/venkat/kafka_2.11-1.1.0/config/producer.properties
-primary_network_ifacename eth0
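
The second error usually means the broker JVM is not exposing remote JMX on that port (the stock Kafka start scripts only do so when the JMX_PORT environment variable is set). A minimal probe for the same RMI JMX URL shape the collector connects to:

    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class JmxProbe {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            // Throws the same "Connection refused" seen below when nothing listens on 9999
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                System.out.println(connector.getMBeanServerConnection().getMBeanCount());
            }
        }
    }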

Detailed Log:
20:13:17.386 [ZkClient-EventThread-12-localhost:2181] INFO org.I0Itec.zkclient.ZkEventThread - Starting ZkClient event thread.
20:13:17.473 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
20:13:17.473 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=192.168.0.103
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.8.0_191
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Oracle Corporation
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/jre
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=libs/kafkastats-0.2.4.3-jar-with-dependencies.jar:libs/doctorkafka-0.2.4.3-jar-with-dependencies.jar:libs/doctorkafka-0.2.4.3.jar:libs/kafkastats-0.2.4.3.jar:kafkastats-0.2.4.3-jar-with-dependencies.jar
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/Users/venkat/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/var/folders/28/qrgydfnx1jv4cl703nzp3lhc92z1zs/T/
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler=<NA>
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Mac OS X
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=x86_64
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=10.13.6
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=venkat
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/Users/venkat
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/Users/venkat/softwares/doctorkafka
20:13:17.475 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@4bdeaabb
20:13:17.483 [main] DEBUG org.apache.zookeeper.ClientCnxn - zookeeper.disableAutoWatchReset is false
20:13:17.498 [main] DEBUG org.I0Itec.zkclient.ZkClient - Awaiting connection to Zookeeper server
20:13:17.498 [main] INFO org.I0Itec.zkclient.ZkClient - Waiting for keeper state SyncConnected
20:13:17.500 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
20:13:17.522 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to localhost/127.0.0.1:2181, initiating session
20:13:17.524 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Session establishment request sent on localhost/127.0.0.1:2181
20:13:17.532 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1689a86df6e0073, negotiated timeout = 30000
20:13:17.534 [main-EventThread] DEBUG org.I0Itec.zkclient.ZkClient - Received event: WatchedEvent state:SyncConnected type:None path:null
20:13:17.535 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (SyncConnected)
20:13:17.535 [main-EventThread] DEBUG org.I0Itec.zkclient.ZkClient - Leaving process event
20:13:17.535 [main] DEBUG org.I0Itec.zkclient.ZkClient - State is SyncConnected
20:13:17.548 [main] INFO kafka.utils.Log4jControllerRegistration$ - Registered kafka:type=kafka.Log4jController MBean
20:13:17.612 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 1,8 replyHeader:: 1,1251,0 request:: '/brokers/ids,F response:: v{'0,'1,'2}
20:13:17.625 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 2,4 replyHeader:: 2,1251,0 request:: '/brokers/ids/0,F response:: #7b226c697374656e65725f73656375726974795f70726f746f636f6c5f6d6170223a7b22504c41494e54455854223a22504c41494e54455854227d2c22656e64706f696e7473223a5b22504c41494e544558543a2f2f6c6f63616c686f73743a39303932225d2c226a6d785f706f7274223a2d312c22686f7374223a226c6f63616c686f7374222c2274696d657374616d70223a2231353438383535383736373337222c22706f7274223a393039322c2276657273696f6e223a347d,s{1209,1209,1548855876737,1548855876737,0,0,0,101500895680659568,188,0,1209}
20:13:17.873 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 3,4 replyHeader:: 3,1251,0 request:: '/brokers/ids/1,F response:: #7b226c697374656e65725f73656375726974795f70726f746f636f6c5f6d6170223a7b22504c41494e54455854223a22504c41494e54455854227d2c22656e64706f696e7473223a5b22504c41494e544558543a2f2f6c6f63616c686f73743a39303933225d2c226a6d785f706f7274223a2d312c22686f7374223a226c6f63616c686f7374222c2274696d657374616d70223a2231353438383535383738313735222c22706f7274223a393039332c2276657273696f6e223a347d,s{1214,1214,1548855878178,1548855878178,0,0,0,101500895680659569,188,0,1214}
20:13:17.874 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 4,4 replyHeader:: 4,1251,0 request:: '/brokers/ids/2,F response:: #7b226c697374656e65725f73656375726974795f70726f746f636f6c5f6d6170223a7b22504c41494e54455854223a22504c41494e54455854227d2c22656e64706f696e7473223a5b22504c41494e544558543a2f2f6c6f63616c686f73743a39303934225d2c226a6d785f706f7274223a2d312c22686f7374223a226c6f63616c686f7374222c2274696d657374616d70223a2231353438383535383738323132222c22706f7274223a393039342c2276657273696f6e223a347d,s{1233,1233,1548855878214,1548855878214,0,0,0,101500895680659570,188,0,1233}
20:13:17.906 [main] INFO org.apache.kafka.clients.producer.ProducerConfig - ProducerConfig values:
acks = 1
batch.size = 1638400
bootstrap.servers = [localhost:9092]
buffer.memory = 3554432
client.id =
compression.type = none
connections.max.idle.ms = 540000
enable.idempotence = false
interceptor.classes = []
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
max.block.ms = 60000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 0
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer

20:13:17.927 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bufferpool-wait-time
20:13:17.932 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name buffer-exhausted-records
20:13:17.936 [main] DEBUG org.apache.kafka.clients.Metadata - Updated cluster metadata version 1 to Cluster(id = null, nodes = [localhost:9092 (id: -1 rack: null)], partitions = [])
20:13:17.942 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name produce-throttle-time
20:13:17.952 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name connections-closed:
20:13:17.952 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name connections-created:
20:13:17.953 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name successful-authentication:
20:13:17.953 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name failed-authentication:
20:13:17.954 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-sent-received:
20:13:17.954 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-sent:
20:13:17.955 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-received:
20:13:17.955 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name select-time:
20:13:17.956 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name io-time:
20:13:17.959 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name batch-size
20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name compression-rate
20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name queue-time
20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name request-time
20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name records-per-request
20:13:17.961 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name record-retries
20:13:17.961 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name errors
20:13:17.961 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name record-size
20:13:17.962 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name batch-split-rate
20:13:17.963 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Starting Kafka producer I/O thread.
20:13:17.964 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 1.1.0
20:13:17.964 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : fdcf75ea326b8e07
20:13:17.965 [main] DEBUG org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=producer-1] Kafka producer started
20:13:17.970 [main] INFO com.pinterest.doctorkafka.stats.BrokerStatsReporter - Starting broker stats reporter.....
20:13:17.988 [StatsReporter] ERROR com.pinterest.doctorkafka.stats.BrokerStatsRetriever - Failed to get ec2 metadata
java.io.IOException: Cannot run program "ec2metadata": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) ~[?:1.8.0_191]
at java.lang.Runtime.exec(Runtime.java:620) ~[?:1.8.0_191]
at java.lang.Runtime.exec(Runtime.java:450) ~[?:1.8.0_191]
at java.lang.Runtime.exec(Runtime.java:347) ~[?:1.8.0_191]
at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.setBrokerInstanceInfo(BrokerStatsRetriever.java:409) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.retrieveBrokerStats(BrokerStatsRetriever.java:579) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.stats.BrokerStatsReporter.run(BrokerStatsReporter.java:57) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method) ~[?:1.8.0_191]
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247) ~[?:1.8.0_191]
at java.lang.ProcessImpl.start(ProcessImpl.java:134) ~[?:1.8.0_191]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ~[?:1.8.0_191]
... 13 more
20:13:17.997 [StatsReporter] INFO com.pinterest.doctorkafka.stats.BrokerStatsRetriever - set hostname to bd-venkat
Jan 30, 2019 8:13:18 PM com.twitter.ostrich.admin.BackgroundProcess start
INFO: Starting LatchedStatsListener
20:13:18.140 [StatsReporter] ERROR com.pinterest.doctorkafka.util.OperatorUtil - Failed to connect to MBeanServer bd-venkat:9999
java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: localhost; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369) ~[?:1.8.0_191]
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270) ~[?:1.8.0_191]
at com.pinterest.doctorkafka.util.OperatorUtil.getMBeanServerConnection(OperatorUtil.java:90) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.retrieveBrokerStats(BrokerStatsRetriever.java:588) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at com.pinterest.doctorkafka.stats.BrokerStatsReporter.run(BrokerStatsReporter.java:57) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_191]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
Caused by: javax.naming.ServiceUnavailableException
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:136) ~[?:1.8.0_191]
at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[?:1.8.0_191]
at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[?:1.8.0_191]
... 11 more
Caused by: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[?:1.8.0_191]
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[?:1.8.0_191]
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[?:1.8.0_191]
at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338) ~[?:1.8.0_191]
at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112) ~[?:1.8.0_191]
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132) ~[?:1.8.0_191]
at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[?:1.8.0_191]
at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[?:1.8.0_191]
... 11 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_191]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_191]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_191]
at java.net.Socket.connect(Socket.java:538) ~[?:1.8.0_191]
at java.net.Socket.<init>(Socket.java:434) ~[?:1.8.0_191]
at java.net.Socket.<init>(Socket.java:211) ~[?:1.8.0_191]
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[?:1.8.0_191]
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148) ~[?:1.8.0_191]
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) ~[?:1.8.0_191]
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[?:1.8.0_191]
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[?:1.8.0_191]
at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338) ~[?:1.8.0_191]
at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112) ~[?:1.8.0_191]
at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132) ~[?:1.8.0_191]
at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[?:1.8.0_191]
at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[?:1.8.0_191]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[?:1.8.0_191]
... 11 more
20:13:18.159 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initialize connection to node localhost:9092 (id: -1 rack: null) for sending metadata request
20:13:18.159 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating connection to node localhost:9092 (id: -1 rack: null)
20:13:18.162 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.bytes-sent
20:13:18.163 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.bytes-received
20:13:18.164 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.latency
20:13:18.166 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.network.Selector - [Producer clientId=producer-1] Created socket with SO_RCVBUF = 326640, SO_SNDBUF = 146988, SO_TIMEOUT = 0 to node -1
20:13:18.254 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Completed connection to node -1. Fetching API versions.
20:13:18.254 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating API versions fetch from node -1.
20:13:18.262 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Recorded API versions for node -1: (Produce(0): 0 to 5 [usable: 5], Fetch(1): 0 to 7 [usable: 7], ListOffsets(2): 0 to 2 [usable: 2], Metadata(3): 0 to 5 [usable: 5], LeaderAndIsr(4): 0 to 1 [usable: 1], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 4 [usable: 4], ControlledShutdown(7): 0 to 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable: 3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1 [usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1 [usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1 [usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1 [usable: 1], SaslHandshake(17): 0 to 1 [usable: 1], ApiVersions(18): 0 to 1 [usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1 [usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0 [usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0], AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0], EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0], TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0], CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0], DescribeConfigs(32): 0 to 1 [usable: 1], AlterConfigs(33): 0 [usable: 0], AlterReplicaLogDirs(34): 0 [usable: 0], DescribeLogDirs(35): 0 [usable: 0], SaslAuthenticate(36): 0 [usable: 0], CreatePartitions(37): 0 [usable: 0], CreateDelegationToken(38): 0 [usable: 0], RenewDelegationToken(39): 0 [usable: 0], ExpireDelegationToken(40): 0 [usable: 0], DescribeDelegationToken(41): 0 [usable: 0], DeleteGroups(42): 0 [usable: 0])
20:13:18.263 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Sending metadata request (type=MetadataRequest, topics=brokerstats) to node localhost:9092 (id: -1 rack: null)
20:13:18.265 [kafka-producer-network-thread | producer-1] INFO org.apache.kafka.clients.Metadata - Cluster ID: 6xm2CYjVSye8SZnX4_By5Q
20:13:18.266 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.Metadata - Updated cluster metadata version 2 to Cluster(id = 6xm2CYjVSye8SZnX4_By5Q, nodes = [localhost:9092 (id: 0 rack: null), localhost:9093 (id: 1 rack: null), localhost:9094 (id: 2 rack: null)], partitions = [Partition(topic = brokerstats, partition = 0, leader = 1, replicas = [1], isr = [1], offlineReplicas = [])])
20:13:18.280 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating connection to node localhost:9093 (id: 1 rack: null)
20:13:18.280 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-1.bytes-sent
20:13:18.281 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-1.bytes-received
20:13:18.281 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-1.latency
20:13:18.282 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.network.Selector - [Producer clientId=producer-1] Created socket with SO_RCVBUF = 326640, SO_SNDBUF = 146988, SO_TIMEOUT = 0 to node 1
20:13:18.282 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Completed connection to node 1. Fetching API versions.
20:13:18.282 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating API versions fetch from node 1.
20:13:18.283 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Recorded API versions for node 1: (Produce(0): 0 to 5 [usable: 5], Fetch(1): 0 to 7 [usable: 7], ListOffsets(2): 0 to 2 [usable: 2], Metadata(3): 0 to 5 [usable: 5], LeaderAndIsr(4): 0 to 1 [usable: 1], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 4 [usable: 4], ControlledShutdown(7): 0 to 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable: 3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1 [usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1 [usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1 [usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1 [usable: 1], SaslHandshake(17): 0 to 1 [usable: 1], ApiVersions(18): 0 to 1 [usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1 [usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0 [usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0], AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0], EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0], TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0], CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0], DescribeConfigs(32): 0 to 1 [usable: 1], AlterConfigs(33): 0 [usable: 0], AlterReplicaLogDirs(34): 0 [usable: 0], DescribeLogDirs(35): 0 [usable: 0], SaslAuthenticate(36): 0 [usable: 0], CreatePartitions(37): 0 [usable: 0], CreateDelegationToken(38): 0 [usable: 0], RenewDelegationToken(39): 0 [usable: 0], ExpireDelegationToken(40): 0 [usable: 0], DescribeDelegationToken(41): 0 [usable: 0], DeleteGroups(42): 0 [usable: 0])
20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.records-per-batch
20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.bytes
20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.compression-rate
20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.record-retries
20:13:18.288 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.record-errors
20:13:18.301 [StatsReporter] INFO com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1548859397978, "id": 1, "name": "bd-venkat", "zkUrl": "localhost:2181", "kafkaVersion": "1.1.0", "statsVersion": "0.1.15", "hasFailure": true, "failureReason": "JmxConnectionFailure", "availabilityZone": null, "instanceType": null, "amiId": null, "rackId": null, "logFilesPath": "/tmp/kafka-logs1", "cpuUsage": 0.0, "freeDiskSpaceInBytes": 196621987840, "totalDiskSpaceInBytes": 250790436864, "leadersBytesIn1MinRate": 0, "leadersBytesOut1MinRate": 0, "leadersBytesIn5MinRate": 0, "leadersBytesOut5MinRate": 0, "leadersBytesIn15MinRate": 0, "leadersBytesOut15MinRate": 0, "sysBytesIn1MinRate": 0, "sysBytesOut1MinRate": 0, "numReplicas": 0, "numLeaders": 0, "topicsBytesIn1MinRate": null, "topicsBytesOut1MinRate": null, "topicsBytesIn5MinRate": null, "topicsBytesOut5MinRate": null, "topicsBytesIn15MinRate": null, "topicsBytesOut15MinRate": null, "leaderReplicaStats": null, "leaderReplicas": null, "followerReplicas": null, "inReassignmentReplicas": null}
Jan 30, 2019 8:13:18 PM com.twitter.ostrich.admin.AdminHttpService start
INFO: Admin HTTP interface started on port 2051.
20:13:18.345 [main] INFO com.pinterest.doctorkafka.util.OperatorUtil - Starting the OpenTsdb metrics pusher
20:13:18.352 [main] INFO com.pinterest.doctorkafka.util.OperatorUtil - OpenTsdb metrics pusher started!
20:13:27.875 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0x1689a86df6e0073 after 0ms
20:13:28.374 [Thread-5] DEBUG com.pinterest.doctorkafka.util.MetricsPusher - Ostrich Metrics 1548859408:
put KafkaOperator.jvm_gc_PS_MarkSweep_msec 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_PS_Scavenge_cycles 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_cycles 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_msec 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_PS_Scavenge_msec 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_PS_MarkSweep_cycles 1548859408 0.0 host=bd-venkat
put KafkaOperator.kafka.stats.collector.success 1548859408 0.0 host=bd-venkat host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Survivor_Space_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_thread_daemon_count 1548859408 10.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_used 1548859408 9.5334352E7 host=bd-venkat
put KafkaOperator.jvm_buffer_direct_used 1548859408 826.0 host=bd-venkat
put KafkaOperator.jvm_thread_peak_count 1548859408 15.0 host=bd-venkat
put KafkaOperator.jvm_fd_limit 1548859408 10240.0 host=bd-venkat
put KafkaOperator.jvm_fd_count 1548859408 36.0 host=bd-venkat
put KafkaOperator.jvm_buffer_mapped_count 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Old_Gen_used 1548859408 1.2706672E7 host=bd-venkat
put KafkaOperator.jvm_current_mem_Metaspace_max 1548859408 -1.0 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Old_Gen_used 1548859408 1.2706672E7 host=bd-venkat
put KafkaOperator.jvm_num_cpus 1548859408 8.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Code_Cache_used 1548859408 5230912.0 host=bd-venkat
put KafkaOperator.jvm_start_time 1548859408 1.54885934E12 host=bd-venkat
put KafkaOperator.jvm_nonheap_used 1548859408 4.1529464E7 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Eden_Space_max 1548859408 1.40928614E9 host=bd-venkat
put KafkaOperator.jvm_buffer_direct_count 1548859408 4.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Compressed_Class_Space_max 1548859408 1.07374182E9 host=bd-venkat
put KafkaOperator.jvm_heap_max 1548859408 3.81786522E9 host=bd-venkat
put KafkaOperator.jvm_compilation_time_msec 1548859408 1793.0 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Survivor_Space_max 1548859408 1.1010048E7 host=bd-venkat
put KafkaOperator.jvm_buffer_mapped_max 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_buffer_mapped_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_nonheap_max 1548859408 -1.0 host=bd-venkat
put KafkaOperator.jvm_nonheap_committed 1548859408 4.2631168E7 host=bd-venkat
put KafkaOperator.jvm_buffer_direct_max 1548859408 826.0 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Eden_Space_max 1548859408 1.40928614E9 host=bd-venkat
put KafkaOperator.jvm_post_gc_used 1548859408 1.2706672E7 host=bd-venkat
put KafkaOperator.jvm_classes_total_loaded 1548859408 5137.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Metaspace_used 1548859408 3.1875904E7 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Eden_Space_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Old_Gen_max 1548859408 2.86366106E9 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Eden_Space_used 1548859408 4.1095296E7 host=bd-venkat
put KafkaOperator.jvm_classes_current_loaded 1548859408 5137.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Compressed_Class_Space_used 1548859408 4425568.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Survivor_Space_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_uptime 1548859408 12302.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Code_Cache_max 1548859408 2.5165824E8 host=bd-venkat
put KafkaOperator.jvm_classes_total_unloaded 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Survivor_Space_max 1548859408 1.1010048E7 host=bd-venkat
put KafkaOperator.jvm_heap_used 1548859408 5.3801968E7 host=bd-venkat
put KafkaOperator.jvm_heap_committed 1548859408 2.00802304E8 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Old_Gen_max 1548859408 2.86366106E9 host=bd-venkat
put KafkaOperator.jvm_thread_count 1548859408 15.0 host=bd-venkat

20:13:28.377 [Thread-5] WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
... 2 common frames omitted
20:13:28.483 [Thread-5] WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
... 2 common frames omitted
^C20:13:30.857 [Thread-5] WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
... 2 common frames omitted
20:13:30.859 [Thread-1] INFO org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
20:13:30.859 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Beginning shutdown of Kafka producer I/O thread, sending remaining records.
20:13:30.863 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name connections-closed:
20:13:30.863 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name connections-created:
20:13:30.863 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name successful-authentication:
20:13:30.864 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name failed-authentication:
20:13:30.864 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-sent-received:
20:13:30.864 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-sent:
20:13:30.865 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-received:
20:13:30.865 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name select-time:
20:13:30.865 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name io-time:
20:13:30.866 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.bytes-sent
20:13:30.866 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.bytes-received
20:13:30.866 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.latency
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-1.bytes-sent
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-1.bytes-received
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-1.latency
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Shutdown of Kafka producer I/O thread has completed.
20:13:30.868 [Thread-1] DEBUG org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=producer-1] Kafka producer has been closed

ERROR com.pinterest.doctorkafka.DoctorKafka - No brokerstats info for cluster

Hi,

I am getting the below error while starting doctorkafka, and the doctorkafka UI is not populated with cluster details. I have pasted the doctorkafka UI screenshot and startup log.
INFO [2019-04-18 10:26:36,215] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/kafka/doctorkafka-0.2.4.5-build
INFO [2019-04-18 10:26:36,217] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@65383667
INFO [2019-04-18 10:26:36,246] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-2.com/192.168.100.6:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-04-18 10:26:36,247] org.apache.curator.framework.imps.CuratorFrameworkImpl: Default schema
WARN [2019-04-18 10:26:36,262] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
! java.net.ConnectException: Connection refused
! at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
! at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
! at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
! at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
INFO [2019-04-18 10:26:36,364] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-2.com/10.144.179.90:2181. Will not attempt to authenticate using SASL (unknown error)
WARN [2019-04-18 10:26:36,365] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
! java.net.ConnectException: Connection refused
! at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
! at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
! at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
! at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
INFO [2019-04-18 10:26:36,466] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-4.com/192.168.100.8:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-04-18 10:26:36,468] org.apache.zookeeper.ClientCnxn: Socket connection established to kafka-poc-4.com/192.168.100.8:2181, initiating session
WARN [2019-04-18 10:26:36,554] com.pinterest.doctorkafka.util.OstrichAdminService: Failed to load properties from build.properties
! java.lang.NullPointerException: null
! at com.pinterest.doctorkafka.util.OstrichAdminService.startAdminHttpService(OstrichAdminService.java:39)
! at com.pinterest.doctorkafka.util.OperatorUtil.startOstrichService(OperatorUtil.java:282)
! at com.pinterest.doctorkafka.DoctorKafkaMain.startMetricsService(DoctorKafkaMain.java:130)
! at com.pinterest.doctorkafka.DoctorKafkaMain.run(DoctorKafkaMain.java:80)
! at com.pinterest.doctorkafka.DoctorKafkaMain.run(DoctorKafkaMain.java:38)
! at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:43)
! at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:87)
! at io.dropwizard.cli.Cli.run(Cli.java:78)
! at io.dropwizard.Application.run(Application.java:93)
! at com.pinterest.doctorkafka.DoctorKafkaMain.main(DoctorKafkaMain.java:149)
INFO [2019-04-18 10:26:36,903] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@3586412d
INFO [2019-04-18 10:26:36,904] org.I0Itec.zkclient.ZkEventThread: Starting ZkClient event thread.
INFO [2019-04-18 10:26:36,904] org.I0Itec.zkclient.ZkClient: Waiting for keeper state SyncConnected
INFO [2019-04-18 10:26:36,908] org.apache.zookeeper.ClientCnxn: Opening socket connection to server kafka-poc-3.com/192.168.100.7:2181. Will not attempt to authenticate using SASL (unknown error)
INFO [2019-04-18 10:26:36,909] org.apache.zookeeper.ClientCnxn: Socket connection established to kafka-poc-3.com/192.168.100.7:2181, initiating session
INFO [2019-04-18 10:26:37,249] com.twitter.ostrich.stats.LatchedStatsListener$$anon$1: Starting LatchedStatsListener
INFO [2019-04-18 10:26:37,520] com.twitter.ostrich.admin.AdminHttpService: Admin HTTP interface started on port 2052.
INFO [2019-04-18 10:26:37,533] io.dropwizard.server.ServerFactory: Starting DoctorKafkaMain
INFO [2019-04-18 10:26:37,620] org.eclipse.jetty.setuid.SetUIDListener: Opened application@ea00de{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
.............
INFO [2019-04-18 10:26:39,226] org.apache.kafka.common.utils.AppInfoParser: Kafka version : 1.1.0
INFO [2019-04-18 10:26:39,226] org.apache.kafka.common.utils.AppInfoParser: Kafka commitId : fdcf75ea326b8e07
15:56:39.233 [pool-8-thread-1] ERROR com.pinterest.doctorkafka.DoctorKafka - No brokerstats info for cluster kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181
INFO [2019-04-18 10:26:39,242] org.apache.kafka.clients.consumer.internals.AbstractCoordinator: [Consumer clientId=consumer-5, groupId=operator_brokerstats_group_sidharth-kafka-poc-1] Successfully joined group with generation 7
INFO [2019-04-18 10:26:39,243] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: [Consumer clientId=consumer-5, groupId=operator_brokerstats_group_sidharth-kafka-poc-1] Setting newly assigned partitions [brokerstats-0, brokerstats-1, brokerstats-2]
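
When the UI shows no clusters, the first thing to isolate is whether the brokerstats topic actually contains data, i.e. whether the kafkastats agents are publishing at all. A minimal consumer check (a sketch; note it takes the broker list, not the zookeeper quorum, and the broker address here is assumed):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class BrokerStatsTopicCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka-poc-2.com:9092"); // assumed broker address
            props.put("group.id", "brokerstats-check");
            props.put("auto.offset.reset", "earliest");
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("brokerstats"));
                ConsumerRecords<byte[], byte[]> records = consumer.poll(10000L);
                // Zero records means kafkastats is not publishing, which is exactly what
                // "No brokerstats info for cluster" reports on the operator side
                System.out.println("brokerstats records fetched: " + records.count());
            }
        }
    }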

dropwizard_yaml_file.yaml
config: doctorkafka.properties
server:
  type: default
  maxThreads: 1024

doctorkafka.properties
###################################################################################

DoctorKafka global settings, including the zookeeper url for kafkastats topic

###################################################################################

[required] zookeeper quorum for storing doctorkafka metadata

#doctorkafka.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181
doctorkafka.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

[required] zookeeper connection string for kafkastats topic

#doctorkafka.brokerstats.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster1
doctorkafka.brokerstats.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

[required] kafka topic name for kafkastats

doctorkafka.brokerstats.topic=brokerstats

[required] the time window that doctorkafka uses to compute

doctorkafka.brokerstats.backtrack.seconds=86400

[optional] ssl related setting for the kafka cluster that hosts brokerstats

can PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL

doctorkafka.brokerstats.consumer.security.protocol=SSL
doctorkafka.brokerstats.consumer.ssl.client.auth=required
doctorkafka.brokerstats.consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
doctorkafka.brokerstats.consumer.ssl.endpoint.identification.algorithm=HTTPS
doctorkafka.brokerstats.consumer.ssl.key.password=key_password
doctorkafka.brokerstats.consumer.ssl.keystore.location=keystore_path
doctorkafka.brokerstats.consumer.ssl.keystore.password=keystore_password
doctorkafka.brokerstats.consumer.ssl.keystore.type=JKS
doctorkafka.brokerstats.consumer.ssl.secure.random.implementation=SHA1PRNG
doctorkafka.brokerstats.consumer.ssl.truststore.location=truststore_path
doctorkafka.brokerstats.consumer.ssl.truststore.password=truststore_password
doctorkafka.brokerstats.consumer.ssl.truststore.type=JKS

[required] zookeeper connection string for action_report topic

#doctorkafka.action.report.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster1
doctorkafka.action.report.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

[required] kafka topics for storing the actions that doctorkafka takes.

doctorkafka.action.report.topic=operator_report

[optional] broker replacement interval in seconds

doctorkafka.action.broker_replacement.interval.seconds=43200

[optional] broker replacement script

doctorkafka.action.broker_replacement.command="/usr/local/bin/ec2-replace-node.py -r "

[optional] ssl related settings for action report producer

doctorkafka.action.producer.security.protocol=SSL
doctorkafka.action.producer.ssl.client.auth=required
doctorkafka.action.producer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
doctorkafka.action.producer.ssl.endpoint.identification.algorithm=HTTPS
doctorkafka.action.producer.ssl.key.password=key_password
doctorkafka.action.producer.ssl.keystore.location=keystore_path
doctorkafka.action.producer.ssl.keystore.password=keystore_password
doctorkafka.action.producer.ssl.keystore.type=JKS
doctorkafka.action.producer.ssl.secure.random.implementation=SHA1PRNG
doctorkafka.action.producer.ssl.truststore.location=truststore_path
doctorkafka.action.producer.ssl.truststore.password=truststore_password
doctorkafka.action.producer.ssl.truststore.type=JKS

# [required] doctorkafka web port

doctorkafka.web.port=8080

# [optional] disable doctorkafka service restart

doctorkafka.restart.disabled=false

# [required] doctorkafka service restart interval

doctorkafka.restart.interval.seconds=86400

# [optional] ostrich port

doctorkafka.ostrich.port=2052

# [optional] tsd host and port

#doctorkafka.tsd.hostport=localhost:18621

# [required] email addresses for sending general notifications, e.g. on cluster under-replication

doctorkafka.emails.notification=email_address_1,email_address_2

# [required] email addresses for sending alerts to

doctorkafka.emails.alert=email_address_3,email_address_4

# [optional] brokerstats.version

doctorkafka.brokerstats.version=0.2.4.5

################################################################################

# cluster1 settings

# [required] whether DoctorKafka runs in dry-run mode

kafkacluster.cluster1.dryrun=true

# [required] zookeeper url for the kafka cluster

#kafkacluster.cluster1.zkurl=zookeeper001:2181,zookeeper002:2181,zookeeper003:2181/cluster1
kafkacluster.cluster1.zkurl=kafka-poc-2.com:2181,kafka-poc-3.com:2181,kafka-poc-4.com:2181

# [required] the network-inbound limit in megabytes

kafkacluster.cluster1.network.inbound.limit.mb=35

# [required] the network-outbound limit in megabytes

kafkacluster.cluster1.network.outbound.limit.mb=80

# [required] the broker's maximum network bandwidth in megabytes

kafkacluster.cluster1.network.bandwith.max.mb=150

# [optional] SSL-related kafka consumer settings for accessing topic metadata of cluster1

kafkacluster.cluster1.consumer.security.protocol=SSL
kafkacluster.cluster1.consumer.ssl.client.auth=required
kafkacluster.cluster1.consumer.ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
kafkacluster.cluster1.consumer.ssl.endpoint.identification.algorithm=HTTPS
kafkacluster.cluster1.consumer.ssl.key.password=key_password
kafkacluster.cluster1.consumer.ssl.keystore.location=keystore_path
kafkacluster.cluster1.consumer.ssl.keystore.password=keystore_password
kafkacluster.cluster1.consumer.ssl.keystore.type=JKS
kafkacluster.cluster1.consumer.ssl.secure.random.implementation=SHA1PRNG
kafkacluster.cluster1.consumer.ssl.truststore.location=truststore_path
kafkacluster.cluster1.consumer.ssl.truststore.password=truststore_password
kafkacluster.cluster1.consumer.ssl.truststore.type=JKS
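
To make the network settings in this file concrete, here is a purely illustrative arithmetic sketch (hypothetical code, not DoctorKafka's implementation) of checking a broker's measured traffic against the configured inbound and outbound limits:

// Hypothetical sketch: flag a broker whose measured traffic exceeds the
// inbound/outbound limits configured in the properties file above.
public class NetworkLimitCheck {
  static final double INBOUND_LIMIT_MB = 35.0;   // network.inbound.limit.mb
  static final double OUTBOUND_LIMIT_MB = 80.0;  // network.outbound.limit.mb

  static boolean isOverloaded(double inMbPerSec, double outMbPerSec) {
    return inMbPerSec > INBOUND_LIMIT_MB || outMbPerSec > OUTBOUND_LIMIT_MB;
  }

  public static void main(String[] args) {
    // A broker doing 20 MB/s in and 95 MB/s out is outbound-overloaded.
    System.out.println(isOverloaded(20.0, 95.0)); // true
  }
}

With the values above, a broker pushing 95 MB/s out exceeds the 80 MB/s outbound limit while staying well under the 150 MB/s bandwidth maximum.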

Concurrency issue

When I issue a large number of HTTP GET requests against /servlet/clusterinfo on a fairly large DoctorKafka config, I occasionally see the following error logged:

java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access
	at org.apache.kafka.clients.consumer.KafkaConsumer.acquire(KafkaConsumer.java:1824) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.apache.kafka.clients.consumer.KafkaConsumer.acquireAndEnsureOpen(KafkaConsumer.java:1808) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1524) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at com.pinterest.doctorkafka.KafkaClusterManager.toJson(KafkaClusterManager.java:146) ~[doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at com.pinterest.doctorkafka.servlet.ClusterInfoServlet.renderJSON(ClusterInfoServlet.java:43) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at com.pinterest.doctorkafka.servlet.DoctorKafkaServlet.doGet(DoctorKafkaServlet.java:130) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:867) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.Server.handle(Server.java:502) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683) [doctorkafka-0.2.4.3-jar-with-dependencies.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
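
The KafkaConsumer javadoc is explicit that the class is not safe for multi-threaded access, and the trace shows a servlet thread calling listTopics() on a consumer that is apparently also used by the cluster-manager thread. One possible workaround, as a minimal sketch (the class and field names here are hypothetical, not DoctorKafka's actual code), is to funnel every call to the shared consumer through a single lock:

import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;

// Hypothetical wrapper: serializes access to a KafkaConsumer, which is
// documented as not safe for concurrent use from multiple threads.
public class SynchronizedTopicLister {
  private final KafkaConsumer<byte[], byte[]> consumer;
  private final Object consumerLock = new Object();

  public SynchronizedTopicLister(KafkaConsumer<byte[], byte[]> consumer) {
    this.consumer = consumer;
  }

  // Every thread that touches the consumer must go through this lock,
  // including any poll loop; otherwise acquire() can still throw.
  public Map<String, List<PartitionInfo>> listTopicsSafely() {
    synchronized (consumerLock) {
      return consumer.listTopics();
    }
  }
}

An alternative is to give the servlet path its own short-lived consumer instead of sharing one, at the cost of an extra connection per request.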

0.2.4.2 kafkastats missing com/fasterxml/jackson/annotation/JsonMerge

Running the jar from https://repo.maven.apache.org/maven2/com/github/pinterest/kafkastats/0.2.4.2/kafkastats-0.2.4.2-jar-with-dependencies.jar, I get the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/annotation/JsonMerge
	at com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector.<clinit>(JacksonAnnotationIntrospector.java:50)
	at com.fasterxml.jackson.databind.ObjectMapper.<clinit>(ObjectMapper.java:291)
	at kafka.utils.Json$.<init>(Json.scala:30)
	at kafka.utils.Json$.<clinit>(Json.scala)
	at kafka.zk.BrokerIdZNode$.decode(ZkData.scala:193)
	at kafka.utils.ZkUtils.parseBrokerJson(ZkUtils.scala:708)
	at kafka.utils.ZkUtils.getBrokerInfo(ZkUtils.scala:871)
	at kafka.utils.ZkUtils.$anonfun$getAllBrokersInCluster$2(ZkUtils.scala:280)
	at kafka.utils.ZkUtils.$anonfun$getAllBrokersInCluster$2$adapted(ZkUtils.scala:280)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:52)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at scala.collection.TraversableLike.map(TraversableLike.scala:234)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at kafka.utils.ZkUtils.getAllBrokersInCluster(ZkUtils.scala:280)
	at com.pinterest.doctorkafka.util.OperatorUtil.getBrokers(OperatorUtil.java:214)
	at com.pinterest.doctorkafka.util.OperatorUtil.createKafkaProducerProperties(OperatorUtil.java:226)
	at com.pinterest.doctorkafka.stats.KafkaAvroPublisher.<init>(KafkaAvroPublisher.java:60)
	at com.pinterest.doctorkafka.stats.KafkaStatsMain.main(KafkaStatsMain.java:132)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.annotation.JsonMerge
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 21 more
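
@JsonMerge was introduced in jackson-annotations 2.9, so this error usually means an older jackson-annotations (2.8 or earlier) ended up on the classpath ahead of the 2.9+ jackson-databind the jar was built against. A quick probe (hypothetical class name; only standard reflection is used) shows whether the annotation resolves and which jar it loads from:

// Hypothetical probe: fails with ClassNotFoundException when no
// jackson-annotations >= 2.9 is visible on the classpath.
public class JsonMergeProbe {
  public static void main(String[] args) throws Exception {
    Class<?> c = Class.forName("com.fasterxml.jackson.annotation.JsonMerge");
    // Prints the jar the annotation class was actually loaded from.
    System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
  }
}

Run it with the same classpath as kafkastats to see which jackson-annotations jar wins.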

Why need tsdb

Hi team,
I am trying to use kafkastats, and before doing so I have a couple of questions:

  1. Why do we need TSDB?
  2. What is the ostrich port?

Slow stats collection outside of AWS

All my Kafka brokers in AWS have no problems meeting the 30 second polling interval for kafkastats. However, all of the brokers on physical hardware show wildly long intervals (13-21 minutes) between metric publishing events.

2019-01-16 18:22:20.856 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547662170130, "id": 1441
2019-01-16 18:43:50.866 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547663459364, "id": 1441
2019-01-16 18:56:41.630 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547664230872, "id": 1441
2019-01-16 19:09:32.229 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547665001633, "id": 1441
2019-01-16 19:22:22.797 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547665772231, "id": 1441
2019-01-16 19:44:07.842 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547667076506, "id": 1441
2019-01-16 19:56:58.609 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547667847848, "id": 1441
2019-01-16 20:09:49.174 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547668618610, "id": 1441
2019-01-16 20:22:39.883 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547669389176, "id": 1441
2019-01-16 20:44:25.059 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547670693733, "id": 1441
2019-01-16 20:57:15.843 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547671465064, "id": 1441
2019-01-16 21:10:06.384 [StatsReporter] INFO  com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1547672235845, "id": 1441

The brokers are averaging 90% idle, with reasonable amounts of free memory. Any ideas where I should be looking to see what it's trying to do?
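
On physical hosts the likely suspects are slow reverse-DNS lookups on the RMI/JMX connection or a very large number of per-topic MBeans to walk each cycle. A hypothetical timing harness (the JMX port and bean names follow the kafkastats examples in this document; this is not part of kafkastats) can show whether the JMX reads themselves account for the gaps:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical harness: times each BrokerTopicMetrics read over JMX to
// find out where a slow polling cycle is spending its time.
public class JmxReadTimer {
  public static void main(String[] args) throws Exception {
    JMXServiceURL url = new JMXServiceURL(
        "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
    try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
      MBeanServerConnection mbs = connector.getMBeanServerConnection();
      ObjectName pattern = new ObjectName(
          "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=*");
      for (ObjectName name : mbs.queryNames(pattern, null)) {
        long start = System.nanoTime();
        mbs.getAttribute(name, "OneMinuteRate");
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(name + " took " + micros + " us");
      }
    }
  }
}

If individual reads are fast but there are tens of thousands of them, the interval drift is simple arithmetic; if single reads take seconds, look at DNS resolution for the RMI stubs.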

Metrics Collector fails to return brokers on localhost

Hi

I have been struggling to get the Kafka metrics collector to start on a broker. I have a Kafka cluster running locally with 3 brokers on localhost (ports 9092-9094).

This is the command I use, stored in a start.sh file:

java -server \
     -Dlog4j.configurationFile=file:../config/log4j2.xml \
     -cp lib/*:kafkastats-0.2.4.9.jar \
     com.pinterest.doctorkafka.stats.KafkaStatsMain \
     -broker 127.0.0.1:9092 \
     -jmxport 9999 \
     -topic brokerstats \
     -zookeeper 127.0.0.1:2181/cluster1 \
     -uptimeinseconds 3600 \
     -pollingintervalinseconds 60 \
     -ostrichport 2051 \
     -tsdhostport localhost:18126 \
     -kafka_config ~/oss/kafka/config/server1.properties \
     -producer_config ~/oss/kafka/config/producer1.properties \
     -primary_network_ifacename eth0

sudo ./start.sh
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
zkUrl:127.0.0.1:2181/cluster1
secPro:PLAINTEXT
brokerStr:null
Exception in thread "main" java.lang.NullPointerException
        at java.util.Hashtable.put(Hashtable.java:460)
        at com.pinterest.doctorkafka.util.OperatorUtil.createKafkaProducerProperties(OperatorUtil.java:239)
        at com.pinterest.doctorkafka.stats.KafkaAvroPublisher.<init>(KafkaAvroPublisher.java:62)
        at com.pinterest.doctorkafka.stats.KafkaStatsMain.main(KafkaStatsMain.java:135)

The list of brokers is empty. When I query ZooKeeper directly with the zookeeper shell, I retrieve the 3 active brokers just fine:

bin/zookeeper-shell.sh localhost:2181
ls /brokers/ids
[1, 2, 3]

Can someone help with what I am doing wrong here?
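
The brokerStr:null line suggests OperatorUtil.getBrokers found no broker znodes under the chroot that was passed in (127.0.0.1:2181/cluster1), while the zookeeper-shell check above lists /brokers/ids at the root, i.e. without the chroot. A small check (hypothetical class, assuming the ZooKeeper client jar is on the classpath) that lists both candidate paths makes any mismatch visible:

import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical check: list broker ids at the root and under /cluster1
// to see which path the brokers actually registered under.
public class BrokerPathCheck {
  public static void main(String[] args) throws Exception {
    CountDownLatch connected = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30_000, event -> {
      if (event.getState() == KeeperState.SyncConnected) {
        connected.countDown();
      }
    });
    connected.await();
    for (String path : new String[] {"/brokers/ids", "/cluster1/brokers/ids"}) {
      try {
        System.out.println(path + " -> " + zk.getChildren(path, false));
      } catch (Exception e) {
        System.out.println(path + " -> " + e.getMessage());
      }
    }
    zk.close();
  }
}

If only /brokers/ids has children, dropping /cluster1 from the -zookeeper argument should let the stats producer find the brokers.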
