awslabs / amazon-redshift-monitoring
Amazon Redshift Advanced Monitoring
License: Apache License 2.0
I deployed this using SAM and I can see the data being fetched from the Redshift cluster within 1 minute; however, the Lambda function still doesn't complete after running for 5 minutes.
Lambda logs report:
Executing Redshift Diagnostic Query: WLMQuerySlotCountWarning
Publishing 24 CloudWatch Metrics
END RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827
REPORT RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827 Duration: 300005.52 ms Billed Duration: 300000 ms Memory Size: 192 MB Max Memory Used: 32 MB
2018-09-11T22:52:41.248Z b054f912-b614-11e8-aa9e-d5eb851e7827 Task timed out after 300.01 seconds
CloudWatch log report:
22:47:42 Executing Redshift Diagnostic Query: WLMQuerySlotCountWarning
22:47:42 Publishing 24 CloudWatch Metrics
22:52:41 END RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827
22:52:41 REPORT RequestId: b054f912-b614-11e8-aa9e-d5eb851e7827 Duration: 300005.52 ms Billed Duration: 300000 ms Memory Size: 192 MB Max Memory Used: 32 MB
22:52:41 2018-09-11T22:52:41.248Z b054f912-b614-11e8-aa9e-d5eb851e7827 Task timed out after 300.01 seconds
22:52:41 Pushing metrics to CloudWatch failed: exception ('Connection aborted.', error(1, 'Operation not permitted'))
22:52:41 /var/task/redshift_monitoring.py:249: SyntaxWarning: name 'debug' is assigned to before global declaration
22:52:41   global debug
Hi there,
I'm trying to set this up using the links you provide below, but I can't complete the setup. I get a message saying:
Check the following transforms: ["AWS::Serverless-2016-10-31"] You must use a change set to create this stack because it includes one or more transforms.
But when I click the Create Change Set button, nothing happens and Execute is still greyed out.
Am I missing a step?
-- joe.
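If the console's change-set flow stalls, deploying from the CLI sidesteps it: `aws cloudformation deploy` creates and executes the change set, including the `AWS::Serverless-2016-10-31` transform, in one step. A sketch, assuming the template is named `deploy.yaml`; the bucket and stack names are placeholders:

```shell
# Upload local artifacts to S3 and rewrite the template to reference them.
aws cloudformation package \
  --template-file deploy.yaml \
  --s3-bucket my-deploy-bucket \
  --output-template-file packaged.yaml

# Create and execute the change set in one command.
aws cloudformation deploy \
  --template-file packaged.yaml \
  --stack-name redshift-advanced-monitoring \
  --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND
```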
I deployed the v1.5 zip, which had the fix for the pgpasslib library, but it seems I am now seeing another missing module.
17:15:47 global debug
17:15:47 START RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Version: $LATEST
17:15:47 Unable to import module 'lambda_function': No module named enum
17:15:47 END RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b
17:15:47 REPORT RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Duration: 0.50 ms Billed Duration: 100 ms Memory Size: 192 MB Max Memory Used: 32 MB
17:16:44 START RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Version: $LATEST
17:16:44 Unable to import module 'lambda_function': No module named enum
17:16:44 END RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b
17:16:44 REPORT RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Duration: 0.68 ms Billed Duration: 100 ms Memory Size: 192 MB Max Memory Used: 32 MB
17:18:35 START RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Version: $LATEST
17:18:35 Unable to import module 'lambda_function': No module named enum
17:18:35 END RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b
17:18:35 REPORT RequestId: 51a926fe-5e2d-43a0-adfd-4e8e022e016b Duration: 0.52 ms Billed Duration: 100 ms Memory Size: 192 MB Max Memory Used: 32 MB
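The `python2.7` Lambda runtime has no stdlib `enum`; the usual fix is to vendor the `enum34` backport into the deployment package before zipping. A sketch; the zip name is a placeholder:

```shell
# From the directory containing lambda_function.py and redshift_monitoring.py:
pip install enum34 -t .           # backport of the Python 3 enum module for 2.7
zip -r redshift-monitoring.zip .  # rebuild and re-upload the function code
```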
Are there plans to update this project with an option to avoid encrypting a password with KMS and instead relying on the IAM role attached to the Lambda function to provide authentication to Redshift?
The section of redshift_monitoring.py that handles passwords checks for an unencrypted password and then sets the password to None, so even when an unencrypted password exists, it is never used.
The flow of logic needs to change so that the unencrypted password can be used.
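A minimal sketch of the corrected flow (parameter names are assumptions, not the project's actual identifiers): prefer the plaintext password when one is present, and only fall back to KMS decryption otherwise.

```python
import base64

def resolve_password(plain_pwd, encrypted_pwd, kms_client):
    # If an unencrypted password was configured, use it directly instead
    # of discarding it (the bug described above).
    if plain_pwd:
        return plain_pwd
    # Otherwise decrypt the base64-encoded CiphertextBlob with KMS.
    response = kms_client.decrypt(CiphertextBlob=base64.b64decode(encrypted_pwd))
    return response['Plaintext'].decode('utf-8')
```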
1.4 is still missing pgpasslib. Please update it.
Non-Lambda version, please? We're in Australia, managing clients with Redshift clusters, but Lambda is not available here yet.
The URL for us-east-1 is incorrect, so the Launch Stack link doesn't work. It currently points to:
https://s3-us-east-1.amazonaws.com/awslabs-code-us-east-1/RedshiftAdvancedMonitoring/deploy-vpc.yaml
It should be:
https://s3.amazonaws.com/awslabs-code-us-east-1/RedshiftAdvancedMonitoring/deploy-vpc.yaml
The metrics cannot be generated with version 1.0.18788; the likely cause is a connection failure. Here is the stack trace:
The read operation timed out: timeout
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 15, in lambda_handler
redshift_monitoring.monitor_cluster(config_sources)
File "/var/task/redshift_monitoring.py", line 329, in monitor_cluster
put_metrics.extend(gather_service_class_stats(cursor, cluster))
File "/var/task/redshift_monitoring.py", line 124, in gather_service_class_stats
''')
File "/var/task/redshift_monitoring.py", line 98, in run_command
cursor.execute(statement)
File "/var/task/lib/pg8000/core.py", line 861, in execute
self._c.execute(self, operation, args)
File "/var/task/lib/pg8000/core.py", line 1909, in execute
self.handle_messages(cursor)
File "/var/task/lib/pg8000/core.py", line 1972, in handle_messages
code, data_len = ci_unpack(self._read(5))
File "/var/lang/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "/var/lang/lib/python3.6/ssl.py", line 1012, in recv_into
return self.read(nbytes, buffer)
File "/var/lang/lib/python3.6/ssl.py", line 874, in read
return self._sslobj.read(len, buffer)
File "/var/lang/lib/python3.6/ssl.py", line 631, in read
v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
In comparison, we have another cluster on the latest version (1.0.18861) which generated and updated the metrics successfully, so we suspect the Redshift version may be a factor.
NOTE
We modified the code to make the timeout an input parameter. The changes to redshift_monitoring.py are listed below:
line 262 # Add timeout to config_sources
line 263 timeout = int(get_config_value(['TimeOut', 'time_out', 'timeOut'], config_sources))
..........
line 305 conn = pg8000.connect(database=database, user=user, password=pwd, host=host, port=port, ssl=ssl, timeout=timeout)
The input JSON of the CloudWatch Events rule:
{
  "DbUser": "xxxxxxx",
  "EncryptedPassword": "**************",
  "ClusterName": "xxxxxxxxxx",
  "HostName": "xxxxxxxxxxxxxx",
  "HostPort": "xxxx",
  "DatabaseName": "xxxxxxxxxx",
  "AggregationInterval": "1 hour",
  "TimeOut": "20"
}
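The patch above can be hardened slightly so that existing configurations without a TimeOut key keep working; a sketch (the 30-second default is an assumption):

```python
def get_timeout(config, default=30):
    """Return the socket timeout in seconds, accepting any of the key
    spellings used in the patch and falling back to `default` when the
    key is absent or not a valid integer."""
    for key in ('TimeOut', 'time_out', 'timeOut'):
        if key in config:
            try:
                return int(config[key])
            except (TypeError, ValueError):
                break
    return default
```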
P.S. Keep support for pgpasslib.
Hi again - I've got the function successfully running hourly, but I don't see any new metrics resulting from it. Here's an error reported in the logs that might be related:
(u'ERROR', u'42P01', u'relation "sensor_data" does not exist', u'/home/ec2-user/padb/src/pg/src/backend/catalog/namespace.c', u'237', u'RangeVarGetRelid'): ProgrammingError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 328, in lambda_handler
put_metrics.extend(run_external_commands('User Configured', 'user-queries.json', cursor, cluster))
File "/var/task/lambda_function.py", line 93, in run_external_commands
interval = run_command(cursor, command['query'])
File "/var/task/lambda_function.py", line 129, in run_command
cursor.execute(statement)
File "/var/task/lib/pg8000/core.py", line 852, in execute
self._c.execute(self, operation, args)
File "/var/task/lib/pg8000/core.py", line 1741, in execute
self.handle_messages(cursor)
File "/var/task/lib/pg8000/core.py", line 1879, in handle_messages
raise self.error
ProgrammingError: (u'ERROR', u'42P01', u'relation "sensor_data" does not exist', u'/home/ec2-user/padb/src/pg/src/backend/catalog/namespace.c', u'237', u'RangeVarGetRelid')
This error is reported at least twice for each Lambda run, amongst the series of diagnostic queries. Here's a screenshot. I do not see any new metrics for Redshift listed.
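One way to keep a single bad user-configured query (like the missing "sensor_data" relation) from surfacing errors on every run is to isolate each query in its own try/except. A sketch, not the project's actual code:

```python
def run_user_queries(cursor, commands):
    """Execute each user-configured query independently, logging failures
    (e.g. a relation that doesn't exist) instead of aborting the batch."""
    results = []
    for command in commands:
        try:
            cursor.execute(command['query'])
            results.append((command['name'], cursor.fetchone()))
        except Exception as e:
            print("User query '%s' failed: %s" % (command['name'], e))
    return results
```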
Unable to deploy latest v1.5 in us-west-2, S3 bucket permissions/access denied.
Unable to import module 'lambda_function': No module named pgpasslib
I used the VPC template for us-east-1
I'm getting the error in the subject each time the function is invoked. Is there a way for me to better diagnose the issue? The function logs in CloudWatch aren't very verbose beyond pointing out the module error.
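If the deployed zip is missing pgpasslib, rebuilding the package with the dependency vendored next to the handler usually resolves the import error. A sketch; the zip name is a placeholder:

```shell
# From the directory containing lambda_function.py and redshift_monitoring.py:
pip install pgpasslib -t .        # vendor the dependency into the package root
zip -r redshift-monitoring.zip .  # rebuild and re-upload the function code
```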
I have set up the KMS key in the same region where the Redshift cluster resides, encrypted the database user password using the command line "aws kms encrypt --key-id $KEY_ID --plaintext ", and edited the lambda_function.py script to fill in the configuration, setting the enc_password field to the "CiphertextBlob" output of that command. Now, when I run a test on the Lambda function, the decrypt step times out. Any suggestion on why it is timing out would be appreciated.
Hi,
It seems that the aggregation interval is not used in the code. I would like to have some monitors run at one interval (every 5 minutes, for example) and other monitors run at a different one (every hour).