This is an AWS Lambda function that ships logs from AWS services to Logz.io.
Note: This project contains code for Python 2 and Python 3. We urge you to use Python 3 because Python 2.7 will reach end of life on January 1, 2020.
AWS Lambda function that ships Cloudwatch Logs to logz.io
License: Apache License 2.0
The GzipLogRequest class does not increment the self._decompress_size variable, and if there are a large number of records in the batch, the upload can fail with a 413 Request Entity Too Large (https://docs.logz.io/shipping/log-sources/json-uploads.html#request-entity-too-large).
You can fix this by updating the write method (https://github.com/logzio/logzio_aws_serverless/blob/master/python3/shipper/shipper.py#L49) of the GzipLogRequest class:
def write(self, log):
    bytes_to_write = bytes("\n" + log, 'utf-8') if self._logs_counter else bytes(log, 'utf-8')
    self._writer.write(bytes_to_write)
    self._decompress_size += sys.getsizeof(bytes_to_write)
    self._logs_counter += 1
When logging from JS code and the message contains [ and ] (such as when logging JSON that contains arrays), the log message is not parsed, nor is its JSON content.
Repro:
Create a JS lambda with the following code:
module.exports.handler = async (event, context) => {
  console.log('Just a message')
  console.log(JSON.stringify({
    message: 'just a JSON message'
  }))
  console.log(JSON.stringify({
    message: 'A message and array',
    array: ['with', 'data'],
  }))
  console.log(`Message with [brackets]`)
  return {
    statusCode: 200
  }
}
We use the cloudwatch shipper.
We enabled lambda insights.
We started getting a lot of logzio-index-failure in our logs.
The index-failed-reason is:
{"type":"mapper_parsing_exception","reason":"failed to parse field [@timestamp] of type [date] in document with id 'REDACTED'. Preview of field's value: 'EXTENSION'","caused_by":{"type":"illegal_argument_exception","reason":"failed to parse date field [EXTENSION] with format [strict_d...
The reason the "@timestamp" field is incorrectly populated seems to be this code.
Lambda Insights logs seem to use a different format from normal logs:
EXTENSION Name: cloudwatch_lambda_agent State: Ready Events: [INVOKE,SHUTDOWN]
These logs don't seem very valuable, so I'd imagine they should never be forwarded to logzio in the first place.
Being from Europe, I used the https://listener-eu.logz.io:8071 endpoint,
with the surprising error result 401: Logging token is not valid.
Either both endpoints need to be synced on tokens, or, if the European endpoint is not valid, the documentation needs to be updated.
Would it be possible to push an official version to SAR?
This makes it super-easy to integrate into CloudFormation, because everyone can then just reference it directly in an existing template, which saves doing a deploy across all your accounts and regions.
If not, I'll probably just do a local build and push to private SAR.
Hi,
With Python 2.7 going EOL on 1st Jan 2020, are there any plans to migrate to Python 3.x or Go?
Kind regards,
Dan
Hey.
Could you guys maybe create tags for released versions?
Also, ideally you could have release artifacts that would be more 'turn-key', rather than us having to copy files around before zipping.
It makes it pretty cumbersome to consume these as IaC.
Any improvement here would be great. For us, a serverless framework plugin would be ideal.
Thanks
I believe there might be an error here in how you're parsing the AWS Lambda logs.
You have this on line 58 of the lambda_function code:
if len(message_parts) == 3:
    log['@timestamp'] = message_parts[0]
    log['requestID'] = message_parts[1]
    log['message'] = message_parts[2]
which I think should be
if len(message_parts) == 4:
    log['@timestamp'] = message_parts[0]
    log['requestID'] = message_parts[1]
    log['logLevel'] = message_parts[2]
    log['message'] = message_parts[3]
I also think that you were trying to ignore the lines in AWS Lambda logs that start with START, END, & REPORT but they're still getting through.
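A minimal sketch of both suggested fixes — the four-part split and filtering of the START/END/REPORT control lines. The helper name and the tab-delimited message format are assumptions for illustration, not the shipper's actual code:

```python
# Hypothetical helper combining both suggested fixes from this issue.
def parse_lambda_message(message):
    # Drop Lambda control lines so they never reach Logz.io
    if message.startswith(('START', 'END', 'REPORT')):
        return None
    # Lambda runtime logs are tab-delimited: timestamp, request ID, level, message
    message_parts = message.split('\t')
    log = {}
    if len(message_parts) == 4:
        log['@timestamp'] = message_parts[0]
        log['requestID'] = message_parts[1]
        log['logLevel'] = message_parts[2]
        log['message'] = message_parts[3]
    else:
        # Fall back to shipping the raw message untouched
        log['message'] = message
    return log
```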
A useful feature for the lambda is the ability to enrich the logs with custom metrics.
One use case is using multiple AWS accounts for the different environments (dev, test, UAT, production); the AWS services, at the time of logging, are not aware of which environment they belong to.
When we ship the logs to Logz.io, it would be handy to add custom properties, such as environment: testing, which would make the querying process easier.
As an implementation, the lambda could leverage variables which are transformed into environment variables at runtime. It could be implemented as a single variable, properties_to_enrich: environment=testing;foo=bar, which would produce a property environment with value testing and a property foo with value bar.
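A minimal sketch of how such a variable could be parsed; the variable name follows the proposal, and the function itself is hypothetical:

```python
import os

# Hypothetical parser for a single enrichment environment variable,
# e.g. "environment=testing;foo=bar" -> {"environment": "testing", "foo": "bar"}
def parse_enrichment(raw):
    properties = {}
    for pair in raw.split(';'):
        pair = pair.strip()
        if not pair:
            continue  # tolerate trailing or doubled semicolons
        key, _, value = pair.partition('=')
        properties[key.strip()] = value.strip()
    return properties

# The variable name PROPERTIES_TO_ENRICH is an assumption from this proposal
enrichment = parse_enrichment(os.environ.get('PROPERTIES_TO_ENRICH', ''))
```

Each shipped log could then be updated with `log.update(enrichment)` before sending.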
When a CloudWatch log message contains a JSON list-type message, the shipper raises the following exception:
[ERROR] AttributeError: 'list' object has no attribute 'items'
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 142, in lambda_handler
    if _parse_cloudwatch_log(log, additional_data):
  File "/var/task/lambda_function.py", line 92, in _parse_cloudwatch_log
    _parse_to_json(log)
  File "/var/task/lambda_function.py", line 77, in _parse_to_json
    for key, value in json_object.items():
https://github.com/logzio/logzio_aws_serverless/blob/master/python3/cloudwatch/src/lambda_function.py#L72-L80
This exception is not caught, since the exception handler is only looking for specific error types.
This can happen for any valid JSON payload that is not an object; I think that's only a list ([...]).
Since the error is not caught, the process exits, so all aws_logs_data that was consumed is likely not processed / shipped. This causes an even bigger blast radius if the user of the log shipper is shipping logs from multiple log groups.
IMHO, the try/except block should be handled in _parse_cloudwatch_log(), which seems to be the method that handles each log.
https://github.com/logzio/logzio_aws_serverless/blob/master/python3/cloudwatch/src/lambda_function.py#L144-L149
https://github.com/logzio/logzio_aws_serverless/blob/master/python3/cloudwatch/src/lambda_function.py#L83-L93
I also think list type should be parsed correctly, but since I'm aware of how the ingestion side process the data, this might be much harder than I imagine.
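A minimal sketch of tolerating non-object JSON payloads; the function name is illustrative and the field chosen for lists is an assumption, since the real shipper's ingestion side may constrain what is possible:

```python
import json

# Illustrative parser that tolerates any valid JSON type, not just objects.
def parse_to_json(log):
    try:
        json_object = json.loads(log['message'])
    except (ValueError, KeyError):
        return log  # message is not JSON (or missing); ship as-is
    if isinstance(json_object, dict):
        # Same behavior as the current shipper: lift object keys into the log
        for key, value in json_object.items():
            log[key] = value
    else:
        # Lists (or other non-object JSON) no longer crash the handler;
        # keep them under a single hypothetical field instead
        log['message_json'] = json_object
    return log
```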
Hi
We are using the shipper.py code in our lambda, with a timeout of 60 seconds, to ship logs to Logz.io.
On very rare occasions, 60 seconds (or more) pass between the time the request is sent and the lambda timing out, without the request ever returning.
We don't see any retry logs, so it looks like the first call to Logz.io doesn't return within 60 seconds, in shipper.py:
request = urllib.request.Request(self._logzio_url, data=self._logs.bytes(),
                                 headers=self._logs.http_headers())
return urllib.request.urlopen(request)
The specific amount of data sent in this specific lambda call which timed out was relatively small, so it is not a size issue.
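One possible mitigation, assuming the hang is a stuck connection: pass an explicit timeout to urlopen so the request fails fast and can be retried, instead of running until the Lambda's own timeout. This is a sketch, not the shipper's actual code; the helper names are illustrative:

```python
import urllib.request

# Build the request separately so it can be inspected or retried
def build_request(url, data, headers):
    return urllib.request.Request(url, data=data, headers=headers)

def send_logs(url, data, headers, timeout_seconds=10):
    request = build_request(url, data, headers)
    # timeout applies to the connection attempt and to each blocking
    # socket operation; a hang raises socket.timeout / URLError instead
    # of stalling until the Lambda is killed
    return urllib.request.urlopen(request, timeout=timeout_seconds)
```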
Questions:
Any help would be great .
Thanks.
3.206 Line
A sequence of zero or more non-&lt;newline&gt; characters plus a terminating &lt;newline&gt; character.
However, in shipper.py, a \n is added only between "lines" and not after every line. For example:
line1\nline2\nline3
This is not POSIX compliant and increases the complexity of the code (requires a counter and an additional if):
logzio_aws_serverless/python3/shipper/shipper.py
Lines 48 to 50 in c1cf511
Assuming logz processes POSIX-compliant lines, the body of write(self, log) should be changed to something like:
self._writer.write(bytes(log+"\n", 'utf-8'))
self._logs_counter += 1
The counter needs to be kept for __len__(self) only.
The GzipLogRequest class is using a gzip.GzipFile class backed by an io.BytesIO stream:
logzio_aws_serverless/python3/shipper/shipper.py
Lines 38 to 39 in c1cf511
The GzipLogRequest.close() method closes the GzipFile:
logzio_aws_serverless/python3/shipper/shipper.py
Lines 69 to 70 in c1cf511
However, as per class gzip.GzipFile:
Calling a GzipFile object’s close() method does not close fileobj
So I believe GzipLogRequest.close() should also call self._logs.close().
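A minimal sketch of the suggested fix, with the class trimmed to the relevant parts (the real class has more members):

```python
import gzip
import io

# Trimmed illustration of the suggested close() fix from this issue.
class GzipLogRequest:
    def __init__(self):
        self._logs = io.BytesIO()
        self._writer = gzip.GzipFile(mode='wb', fileobj=self._logs)

    def close(self):
        self._writer.close()  # flushes the gzip trailer into self._logs
        self._logs.close()    # GzipFile.close() does not close fileobj
```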
When loading the shipper into Lambda and following the guide as described, the following error is received when tested, and the logs aren't being pulled through as expected:
{
"errorMessage": "Unable to import module 'lambda_function'"
}
START RequestId: e1cd6cb4-d3a9-11e8-b091-fb68b51def34 Version: $LATEST
Unable to import module 'lambda_function': No module named shipper
END RequestId: e1cd6cb4-d3a9-11e8-b091-fb68b51def34
REPORT RequestId: e1cd6cb4-d3a9-11e8-b091-fb68b51def34 Duration: 0.96 ms Billed Duration: 100 ms Memory Size: 512 MB Max Memory Used: 18 MB
The default value for FORMAT is 'text'. This results in errors parsing the events.
The default value should be 'json'.
The current lambda results in the following error when configured as instructed. Removing line #39 resolves the error:
'memory_limit_in_mb': KeyError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 77, in lambda_handler
    _parse_cloudwatch_log(log, aws_logs_data)
  File "/var/task/lambda_function.py", line 39, in _parse_cloudwatch_log
    log['memory_limit_in_mb'] = aws_logs_data['memory_limit_in_mb']
KeyError: 'memory_limit_in_mb'
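Instead of removing the line, a defensive alternative would be to fall back gracefully when the key is absent. A minimal sketch (the helper name is hypothetical):

```python
# Hypothetical defensive fix: only set the field when the payload has it,
# instead of raising KeyError for non-Lambda log groups.
def add_memory_limit(log, aws_logs_data):
    memory_limit = aws_logs_data.get('memory_limit_in_mb')
    if memory_limit is not None:
        log['memory_limit_in_mb'] = memory_limit
    return log
```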
How do I have to log messages in a Node.js lambda function so the shipper will handle them as JSON?
Currently, when I console.log(logObj), this lands in the CloudWatch and Logz.io logs:
2018-08-15T07:37:07.085Z 02af8f7c-a05e-11e8-acf8-1d43011671d3 { version: 'dev', .... }