
amazon-redshift-monitoring's People

Contributors

enkeboll, frankfarrell, hyandell, ianmeyers, jaskirat, javierros, tomdaly, vintageplayer

amazon-redshift-monitoring's Issues

Lambda function timing out while publishing cloudwatch metrics

I deployed the CloudWatch monitoring using SAM, and I can see the data being fetched from the Redshift cluster within 1 minute. However, the Lambda function still has not completed after running for 5 minutes.

Lambda logs report:
Executing Redshift Diagnostic Query: WLMQuerySlotCountWarning
Publishing 24 CloudWatch Metrics
END RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827
REPORT RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827 Duration: 300005.52 ms Billed Duration: 300000 ms Memory Size: 192 MB Max Memory Used: 32 MB
2018-09-11T22:52:41.248Z b054f912-b614-11e8-aa9e-d5eb851e7827 Task timed out after 300.01 seconds

CloudWatch log report:

22:47:42 Executing Redshift Diagnostic Query: WLMQuerySlotCountWarning
22:47:42 Publishing 24 CloudWatch Metrics
22:52:41 END RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827
22:52:41 REPORT RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827 Duration: 300005.52 ms Billed Duration: 300000 ms Memory Size: 192 MB Max Memory Used: 32 MB
22:52:41 2018-09-11T22:52:41.248Z b054f912-b614-11e8-aa9e-d5eb851e7827 Task timed out after 300.01 seconds
22:52:41 Pushing metrics to CloudWatch failed: exception ('Connection aborted.', error(1, 'Operation not permitted'))
22:52:41 /var/task/redshift_monitoring.py:249: SyntaxWarning: name 'debug' is assigned to before global declaration
22:52:41 global debug
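
The log above shows the Redshift queries finishing quickly and the function then hanging until the Lambda timeout while publishing. One possible cause (an assumption, not confirmed in this issue) is that the function runs inside a VPC with no network route to the CloudWatch endpoint. A minimal sketch of a reachability check with a short timeout follows; `can_reach` is a hypothetical helper and the region in the usage comment is a placeholder.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures and timeouts.
        return False


# Example call from inside the Lambda handler (region is a placeholder):
# print(can_reach("monitoring.us-east-1.amazonaws.com"))
```

If this returns False from inside the function, the hang is likely a networking problem rather than a bug in the metric publishing code.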

Can't create Change Set

Hi there,
I'm trying to set this up using the links you provide below, but I can't complete the setup. I get a message saying:
Check the following transforms: ["AWS::Serverless-2016-10-31"] You must use a change set to create this stack because it includes one or more transforms.

But when I click the Create Change Set button, nothing happens and Execute is still greyed out.

Am I missing a step?

-- joe.

No module named enum

I deployed the v1.5 zip, which had the fix for the pgpass library, and now it seems another module is missing.

17:15:47 global debug
17:15:47 START RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Version: $LATEST
17:15:47 Unable to import module 'lambda_function': No module named enum
17:15:47 END RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b
17:15:47 REPORT RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Duration: 0.50 ms Billed Duration: 100 ms Memory Size: 192 MB Max Memory Used: 32 MB
17:16:44 START RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Version: $LATEST
17:16:44 Unable to import module 'lambda_function': No module named enum
17:16:44 END RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b
17:16:44 REPORT RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Duration: 0.68 ms Billed Duration: 100 ms Memory Size: 192 MB Max Memory Used: 32 MB
17:18:35 START RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Version: $LATEST
17:18:35 Unable to import module 'lambda_function': No module named enum
17:18:35 END RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b
17:18:35 REPORT RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Duration: 0.52 ms Billed Duration: 100 ms Memory Size: 192 MB Max Memory Used: 32 MB
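
For context, `enum` has been part of the Python standard library since 3.4; on Python 2.7 it comes from the `enum34` backport, which would need to be bundled in the deployment zip. Whether that is the cause here is an assumption. A small sketch for checking which modules a given runtime cannot find is shown below; `missing_modules` is a hypothetical helper, written for Python 3.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Run with the same interpreter version and search path as the Lambda runtime.
print(missing_modules(["enum", "json"]))
```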

Support for IAM-based authentication

Are there plans to update this project with an option to avoid encrypting a password with KMS and instead rely on the IAM role attached to the Lambda function to authenticate to Redshift?
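
As an illustration of what such an option might look like (this is not something the project is stated to support), the Redshift API offers `GetClusterCredentials`, which returns a temporary database user and password for a caller whose IAM role allows it. The sketch below takes a boto3 Redshift client as a parameter; the function name `temporary_credentials` and its arguments are hypothetical.

```python
def temporary_credentials(redshift_client, cluster_id, db_user, db_name,
                          duration_seconds=900):
    """Fetch short-lived database credentials using the caller's IAM permissions."""
    response = redshift_client.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbUser=db_user,
        DbName=db_name,
        DurationSeconds=duration_seconds,
        AutoCreate=False,  # do not create the database user if it is missing
    )
    # The returned user name may differ from db_user (for example an "IAM:" prefix).
    return response["DbUser"], response["DbPassword"]


# Usage, assuming boto3 is available and the role grants redshift:GetClusterCredentials:
# import boto3
# user, pwd = temporary_credentials(boto3.client("redshift"), "my-cluster", "monitor", "dev")
```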

Can't use password without encrypting with KMS

The section in redshift_monitoring.py that handles passwords checks for an unencrypted password and then sets the password to None. Because of this, even if an unencrypted password exists, it will never be used.

Need to change the flow of logic here so that the unencrypted password can be used.
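
One way the flow could be restructured is sketched below. This is an assumption about the intended behaviour, not the project's actual code; `resolve_password` and its parameters are hypothetical names, and `decrypt` stands in for whatever KMS call the script uses.

```python
def resolve_password(plain_password, encrypted_password, decrypt):
    """Prefer an unencrypted password when present, otherwise decrypt the encrypted one."""
    if plain_password:
        return plain_password
    if encrypted_password:
        return decrypt(encrypted_password)
    raise ValueError("No password supplied: set either a plain or an encrypted password")
```

With this ordering the decrypt step is only reached when no unencrypted password was configured.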

No LAMBDA

Non-Lambda version please? We're in Australia, managing clients with Redshift clusters, but Lambda is not available here yet.

Cannot work with Redshift version 1.0.18788

The metrics cannot be generated with version 1.0.18788. A possible cause is a connection failure. Here is the stack trace:

The read operation timed out: timeout
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 15, in lambda_handler
    redshift_monitoring.monitor_cluster(config_sources)
  File "/var/task/redshift_monitoring.py", line 329, in monitor_cluster
    put_metrics.extend(gather_service_class_stats(cursor, cluster))
  File "/var/task/redshift_monitoring.py", line 124, in gather_service_class_stats
    ''')
  File "/var/task/redshift_monitoring.py", line 98, in run_command
    cursor.execute(statement)
  File "/var/task/lib/pg8000/core.py", line 861, in execute
    self._c.execute(self, operation, args)
  File "/var/task/lib/pg8000/core.py", line 1909, in execute
    self.handle_messages(cursor)
  File "/var/task/lib/pg8000/core.py", line 1972, in handle_messages
    code, data_len = ci_unpack(self._read(5))
  File "/var/lang/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/var/lang/lib/python3.6/ssl.py", line 1012, in recv_into
    return self.read(nbytes, buffer)
  File "/var/lang/lib/python3.6/ssl.py", line 874, in read
    return self._sslobj.read(len, buffer)
  File "/var/lang/lib/python3.6/ssl.py", line 631, in read
    v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

In comparison, we have another cluster with the latest version (1.0.18861) which generated and updated the metrics successfully. Based on this comparison, we suspect that the Redshift version affects the project.

NOTE
We modified the code to make the timeout an input parameter. The code changes are listed below:
redshift_monitoring.py

line 262     # Add timeout to config_sources
line 263     timeout = int(get_config_value(['TimeOut', 'time_out', 'timeOut'], config_sources))
..........
line 305     conn = pg8000.connect(database=database, user=user, password=pwd, host=host, port=port, ssl=ssl, timeout=timeout)

The input JSON of the CloudWatch Events rule:
{ "DbUser": "xxxxxxx", "EncryptedPassword": "**************", "ClusterName": "xxxxxxxxxx", "HostName": "xxxxxxxxxxxxxx", "HostPort": "xxxx", "DatabaseName": "xxxxxxxxxx", "AggregationInterval": "1 hour", "TimeOut": "20" }
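
Note that the change quoted above calls `int(...)` on the looked-up value, so an event without a `TimeOut` entry would presumably fail if the lookup returns nothing. A sketch of a more forgiving lookup follows; `read_timeout` and the default of 60 seconds are hypothetical and do not come from the project.

```python
DEFAULT_TIMEOUT_SECONDS = 60  # hypothetical default, not taken from the project


def read_timeout(event, keys=("TimeOut", "time_out", "timeOut"),
                 default=DEFAULT_TIMEOUT_SECONDS):
    """Return the first timeout found under any accepted key, as an int, else the default."""
    for key in keys:
        value = event.get(key)
        if value not in (None, ""):
            return int(value)
    return default


print(read_timeout({"AggregationInterval": "1 hour", "TimeOut": "20"}))
```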

P.S

  • Project version: 1.7
  • Deployment method: manually create AWS resources and upload zip onto Lambda function

no new metrics after 3 days?

Hi again - I've got the function successfully running hourly, but I don't see any new metrics resulting from it. Here's an error reported in the logs that might be related:

(u'ERROR', u'42P01', u'relation "sensor_data" does not exist', u'/home/ec2-user/padb/src/pg/src/backend/catalog/namespace.c', u'237', u'RangeVarGetRelid'): ProgrammingError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 328, in lambda_handler
    put_metrics.extend(run_external_commands('User Configured', 'user-queries.json', cursor, cluster))
  File "/var/task/lambda_function.py", line 93, in run_external_commands
    interval = run_command(cursor, command['query'])
  File "/var/task/lambda_function.py", line 129, in run_command
    cursor.execute(statement)
  File "/var/task/lib/pg8000/core.py", line 852, in execute
    self._c.execute(self, operation, args)
  File "/var/task/lib/pg8000/core.py", line 1741, in execute
    self.handle_messages(cursor)
  File "/var/task/lib/pg8000/core.py", line 1879, in handle_messages
    raise self.error
ProgrammingError: (u'ERROR', u'42P01', u'relation "sensor_data" does not exist', u'/home/ec2-user/padb/src/pg/src/backend/catalog/namespace.c', u'237', u'RangeVarGetRelid')

This error is reported at least twice for each Lambda run, amongst the series of diagnostic queries. I do not see any new metrics for Redshift listed.
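
The traceback shows the exception from a query in user-queries.json propagating out of `lambda_handler`, which may be why nothing gets published. One possible mitigation, sketched under the assumption that each command is a dict with a `query` entry (as in the traceback) and a `name` entry (hypothetical), is to isolate failures per query. `run_user_queries` and `run_query` are illustrative names.

```python
def run_user_queries(commands, run_query):
    """Run each user query, collecting failures instead of aborting the whole run."""
    results, failures = {}, {}
    for command in commands:
        name = command["name"]
        try:
            results[name] = run_query(command["query"])
        except Exception as exc:  # broad on purpose: one bad query should not stop the rest
            # Depending on transaction settings, the connection may also need a rollback here.
            failures[name] = str(exc)
    return results, failures
```

The returned `failures` mapping can then be logged so that a missing relation such as "sensor_data" is reported without blocking the remaining metrics.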

module initialization error: 'db_user'

I'm getting the error in the subject each time the function is invoked. Is there a way for me to better diagnose the issue? The function logs in CloudWatch aren't very verbose beyond pointing out the module error.
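
An error message consisting only of a quoted key name, like 'db_user', is what a Python `KeyError` looks like when a dictionary lookup fails at import time. Assuming that is what happens here, a sketch of a check that names every missing setting up front is given below; apart from `db_user`, the key names are hypothetical.

```python
REQUIRED_KEYS = ("db_user", "host_name", "cluster_name")  # only db_user comes from the issue


def missing_settings(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the given mapping."""
    return [key for key in required if not config.get(key)]


# Example: check the process environment before connecting.
# import os
# problems = missing_settings(os.environ)
# if problems:
#     raise RuntimeError("Missing configuration: " + ", ".join(problems))
```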

Password decrypt is timing out

I have set up the KMS key in the same region where the Redshift cluster resides. I encrypted the database user password using the command line "aws kms encrypt --key-id $KEY_ID --plaintext ". I then edited the lambda_function.py script to fill in the configuration, setting the enc_password field to the "CiphertextBlob" output from the above command. Now, when I run a test on the Lambda function, the decrypt step times out. Any suggestions on why it is timing out would be appreciated.
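
For reference, the decrypt step described above can be sketched as follows. The AWS CLI prints `CiphertextBlob` as base64 text, so it is decoded before being passed to KMS. The function name `decrypt_password` is hypothetical, and the KMS client is passed in so the sketch does not depend on how the project creates it.

```python
import base64


def decrypt_password(kms_client, ciphertext_b64):
    """Decode a base64 CiphertextBlob and ask KMS to decrypt it, returning text."""
    blob = base64.b64decode(ciphertext_b64)
    response = kms_client.decrypt(CiphertextBlob=blob)
    return response["Plaintext"].decode("utf-8")


# Usage, assuming boto3 is available. Short timeouts make a network problem fail fast:
# import boto3
# from botocore.config import Config
# kms = boto3.client("kms", config=Config(connect_timeout=5, read_timeout=5))
# password = decrypt_password(kms, enc_password)
```

A timeout at this step, rather than an access-denied error, often points to the function having no network path to the KMS endpoint, though that is only a guess for this report.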

Interval is not being used

Hi,
It seems that the aggregation interval is not used in the code.
I would like to have some monitors run at one interval (every 5 minutes, for example)
and other monitors run at a different one (every hour).
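
A rough sketch of how per-monitor intervals could work is shown below, assuming the function is triggered at the smallest interval and decides on each run which monitors are due. Nothing here reflects the project's current code; `due_monitors` and the monitor names are hypothetical.

```python
def due_monitors(intervals_minutes, minute_counter):
    """Return the monitors whose interval divides the current minute counter."""
    return sorted(
        name
        for name, every in intervals_minutes.items()
        if minute_counter % every == 0
    )


# Hypothetical configuration: one monitor every 5 minutes, another every hour.
intervals = {"wlm_queue_stats": 5, "table_skew": 60}
print(due_monitors(intervals, 65))   # only the 5-minute monitor is due
print(due_monitors(intervals, 120))  # both monitors are due
```

The minute counter could be derived from the trigger timestamp, for example minutes since the Unix epoch.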
