scribery / aushape
A library and a tool for converting audit logs to XML and JSON
Home Page: https://scribery.github.io/aushape/
License: GNU Lesser General Public License v2.1
Argument checking is inconsistent: some functions use assertions, some return AUSHAPE_RC_INVALID_ARGS, and some assert on that and other return values.
Decide on the rules and implement them.
Some functions in headers don't have an extern keyword in front of their declarations. Make sure they all do.
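The convention could look roughly like the sketch below. The header, the source file, and the function name are hypothetical and are not taken from the aushape source; both files are shown in one block for brevity.

```c
/* example_buf.h: hypothetical header, all names are illustrative only */
#ifndef EXAMPLE_BUF_H
#define EXAMPLE_BUF_H

#include <stddef.h>

/* Every function declaration in a header carries an explicit "extern" */
extern size_t example_buf_len(const char *str);

#endif /* EXAMPLE_BUF_H */

/* example_buf.c: the definition itself needs no "extern" */
#include <string.h>

size_t
example_buf_len(const char *str)
{
    /* Treat a NULL string as empty */
    return str == NULL ? 0 : strlen(str);
}
```

Functions have external linkage by default in C, so the keyword changes nothing for the compiler; the point is consistency across headers.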
This requires controllable support for #8.
Implement a test verifying that the library can be built with and used.
Since one input record can affect several output records, put the raw records directly under the event layer. This has the nice benefit of removing the need for a "fields" container in JSON, and it also simplifies record collectors.
Syslog is usually limited by message size. Find out whether we can support some output which lets us log (essentially unlimited) execve documents.
Make sure the conv code never assumes that there are only two formats possible and doesn't do something like:
    if (format == AUSHAPE_FORMAT_XML) {
        /* Format is XML */
    } else {
        /* *Assume* format is JSON */
    }
Instead do this:
    if (format == AUSHAPE_FORMAT_XML) {
        /* Format is XML */
    } else if (format == AUSHAPE_FORMAT_JSON) {
        /* Format *is* JSON */
    }
This way, in case another format is added, there won't be a possibility of output in mixed format.
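One possible extension of the same idea is to make the unhandled case visible to the caller rather than silently skipping it. The sketch below uses a stand-in enum with a hypothetical third value; the enum, the function, and the returned strings are assumptions made for illustration and are not aushape's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the library's format enum; the third value is hypothetical */
enum example_format {
    EXAMPLE_FORMAT_XML,
    EXAMPLE_FORMAT_JSON,
    EXAMPLE_FORMAT_OTHER
};

/*
 * Return the name of the markup to emit for a format,
 * or NULL if the format is not handled by this code.
 */
const char *
example_format_markup(enum example_format format)
{
    if (format == EXAMPLE_FORMAT_XML) {
        /* Format is XML */
        return "xml";
    } else if (format == EXAMPLE_FORMAT_JSON) {
        /* Format *is* JSON */
        return "json";
    }
    /* Unknown format: report it instead of emitting JSON by default */
    return NULL;
}
```

A caller can then turn the NULL into an error return or an assertion failure, whichever the project's rules prescribe.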
As events can be sized arbitrarily by users specifying arbitrary indent sizes and long hostnames, it is not possible to guarantee a minimum event size (unless we calculate that based on other settings, which would be complicated to implement and use).
Therefore don't fail the assertion on failing to trim, but instead produce a warning somewhere or just ignore it.
Various errors can occur during conversion, such as unknown records/fields, invalid field/record format, unexpected duplicated record types, etc.
Since aushape is supposed to run reliably under auditd, and can't simply stop processing the log, it needs to handle and report those errors somewhere.
Output events which failed to parse as a special type of event, containing the raw records and description of the failure.
Generate both XML and JSON schemas from whatever data it is possible to extract from auditd source tree and auditd field registry.
Implement a command-line or configuration option to specify input log character encoding.
Convert the input to UTF-8 before processing and outputting to both JSON and XML.
Since JSON can't represent duplicate records in objects, and object arrays are hard to use in ElasticSearch, figure out what to do with repeated records of the same type in one event.
At the moment repeated execve records are stitched together. There are still other repeated record types: AVC (in permissive mode), PATH, and OBJ_PID (if a signal is sent to multiple processes), at least.
One option is to aggregate them, similarly to execve, but more complicated records would still run into the problem of arrays of objects and ElasticSearch flattening.
Another option is to multiply events with repeated records, outputting each event with a single record from the sequence.
A third option is to simply output records in an array, but this will be hit hardest by ElasticSearch array flattening, and will be hard to access.
Ignore events consisting of a single EOE record, which auparse sometimes produces.
Provide an option to limit event size. Events exceeding the limit can be replaced with an event carrying a special attribute saying the event was truncated. This can be a good start. Later, adaptive truncation can be implemented, such as truncating some records, perhaps with a separate record size limit, or truncating the execve record argument list, also with a separate limit.
Make XML and JSON log header and trailer a (configurable) part of converter output.
At the moment raw representation is concatenated together with a newline at the end, which is hard to read when viewing human-oriented output.
Instead output each raw line in its own (array) element.
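A minimal sketch of that transformation for the JSON case, assuming the raw text arrives as one newline-separated string. The function names are hypothetical, empty lines are skipped, and only the double quote and backslash are escaped; a real implementation would also need to escape control characters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Append a character to a bounded, NUL-terminated buffer */
static bool
example_putc(char *buf, size_t size, size_t *plen, char c)
{
    /* Need room for the character and the terminating NUL */
    if (*plen + 1 >= size) {
        return false;
    }
    buf[(*plen)++] = c;
    buf[*plen] = '\0';
    return true;
}

/*
 * Write newline-separated raw text as a JSON array of strings,
 * one element per non-empty line.
 * Returns buf on success, NULL on invalid arguments or lack of space.
 */
char *
example_raw_to_json_array(char *buf, size_t size, const char *raw)
{
    size_t len = 0;
    bool in_line = false;   /* True while a string element is open */
    bool first = true;      /* True until the first element is started */
    const char *p;

    if (buf == NULL || raw == NULL || size == 0) {
        return NULL;
    }
    buf[0] = '\0';
    if (!example_putc(buf, size, &len, '[')) {
        return NULL;
    }
    for (p = raw; *p != '\0'; p++) {
        if (*p == '\n') {
            /* End of line: close the current element, if any */
            if (in_line) {
                if (!example_putc(buf, size, &len, '"')) {
                    return NULL;
                }
                in_line = false;
            }
            continue;
        }
        if (!in_line) {
            /* Start of line: add a separator and open a new element */
            if (!first && !example_putc(buf, size, &len, ',')) {
                return NULL;
            }
            if (!example_putc(buf, size, &len, '"')) {
                return NULL;
            }
            in_line = true;
            first = false;
        }
        /* Escape the characters that would terminate or break the string */
        if ((*p == '"' || *p == '\\') &&
            !example_putc(buf, size, &len, '\\')) {
            return NULL;
        }
        if (!example_putc(buf, size, &len, *p)) {
            return NULL;
        }
    }
    /* Close the last element if the text had no trailing newline */
    if (in_line && !example_putc(buf, size, &len, '"')) {
        return NULL;
    }
    return example_putc(buf, size, &len, ']') ? buf : NULL;
}
```

For example, the two-line input `a=1\nb="x"\n` becomes `["a=1","b=\"x\""]`.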
Check if it's possible to not pass the "first" argument to functions, instead adding appropriate separators outside of them.
Make the converter able to retry an operation if the failure is recoverable (e.g. an output failure), and stick permanently to the error if it is not (e.g. an auparse error messing up its state).
Format function return value descriptions in comments according to doxygen documentation.
See http://www.stack.nl/~dimitri/doxygen/manual/commands.html#cmdreturn
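As an illustration, a function comment using the doxygen `@return` and `@retval` commands might look like the sketch below. The return code enum and the function are hypothetical and exist only to give the comment something to document.

```c
#include <stddef.h>

/** Hypothetical return codes, for illustration only */
enum example_rc {
    EXAMPLE_RC_OK = 0,
    EXAMPLE_RC_INVALID_ARGS
};

/**
 * Count newline characters in a buffer.
 *
 * @param ptr   The buffer to scan.
 * @param len   The buffer length, in bytes.
 * @param pnum  Location for the number of newlines found.
 *
 * @return Return code:
 * @retval EXAMPLE_RC_OK            Counted successfully.
 * @retval EXAMPLE_RC_INVALID_ARGS  A pointer argument was NULL.
 */
enum example_rc
example_count_lines(const char *ptr, size_t len, size_t *pnum)
{
    size_t num = 0;
    size_t i;

    if (ptr == NULL || pnum == NULL) {
        return EXAMPLE_RC_INVALID_ARGS;
    }
    for (i = 0; i < len; i++) {
        if (ptr[i] == '\n') {
            num++;
        }
    }
    *pnum = num;
    return EXAMPLE_RC_OK;
}
```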
Implement whatever is necessary to run under audispd.
Make the converter distinguish between continuous and discrete outputs, e.g. a file and a syslog output, and treat them differently. Continuous outputs can receive data in arbitrary pieces; discrete outputs can receive only complete documents.
Instead of using copy-pasted (but small) macros to handle failures and error returns, define a global set and use them everywhere.
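One possible shape for such a shared set is sketched below, assuming functions keep their result in a local `rc` variable and have a `cleanup` label. The macro names, the return codes, and the helper functions are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes, for illustration only */
enum example_rc {
    EXAMPLE_RC_OK = 0,
    EXAMPLE_RC_INVALID_ARGS,
    EXAMPLE_RC_NOMEM
};

/* Evaluate an expression yielding a return code, bail out on failure */
#define EXAMPLE_GUARD(EXPR) \
    do {                                \
        rc = (EXPR);                    \
        if (rc != EXAMPLE_RC_OK) {      \
            goto cleanup;               \
        }                               \
    } while (0)

/* Bail out with the specified return code if a condition is false */
#define EXAMPLE_GUARD_BOOL(RC, COND) \
    do {                                \
        if (!(COND)) {                  \
            rc = (RC);                  \
            goto cleanup;               \
        }                               \
    } while (0)

/* Append a string to a bounded, NUL-terminated buffer */
static enum example_rc
example_append(char *buf, size_t size, const char *str)
{
    size_t used = strlen(buf);
    size_t len = strlen(str);

    if (used + len >= size) {
        return EXAMPLE_RC_NOMEM;
    }
    memcpy(buf + used, str, len + 1);
    return EXAMPLE_RC_OK;
}

/* Write a "name=value" pair to a buffer using the shared macros */
enum example_rc
example_write_pair(char *buf, size_t size,
                   const char *name, const char *value)
{
    enum example_rc rc;

    EXAMPLE_GUARD_BOOL(EXAMPLE_RC_INVALID_ARGS,
                       buf != NULL && size > 0 &&
                       name != NULL && value != NULL);
    buf[0] = '\0';
    EXAMPLE_GUARD(example_append(buf, size, name));
    EXAMPLE_GUARD(example_append(buf, size, "="));
    EXAMPLE_GUARD(example_append(buf, size, value));
    rc = EXAMPLE_RC_OK;
cleanup:
    return rc;
}
```

Keeping the macros in one header means a later change of policy, such as logging every failure, is made in a single place.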
Consider switching to a global return code type, instead of using per-module return codes as planned earlier.
Consider improving formatting code structure. E.g. make an entity output code not care about entity separators, let the invoking code deal with that. Look for other logic failures.
At the moment aushape has a single executable, which can be used for both streaming audit log to syslog (and possibly other targets) and doing single-shot conversion. This results in a somewhat complicated interface, which might be confusing and difficult to understand for new users.
Consider making two separate programs using the same library: one for single-shot conversion, another for streaming.
The benefits can be a simpler interface and a clearer separation of purpose. The downside can be either the inability to stream an already saved file, or the streaming program ending up with about the same interface complexity.
Mask and divert invalid UTF-8 sequences in JSON output. Check how XML handles them and implement a similar scheme, if necessary.
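The masking half could be sketched as follows, under the assumption that each invalid byte is replaced with U+FFFD (the Unicode replacement character). That substitution is only one possible masking scheme, the function names are hypothetical, and diverting the original bytes elsewhere is not shown.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the valid UTF-8 sequence at the start of a buffer,
 * or 0 if the buffer doesn't start with one. Rejects overlong encodings,
 * surrogates, and code points above U+10FFFF.
 */
static size_t
example_utf8_seq_len(const unsigned char *p, size_t len)
{
    unsigned long cp;   /* Decoded code point */
    unsigned long min;  /* Smallest code point allowed for this length */
    size_t need;
    size_t i;

    if (len == 0) {
        return 0;
    }
    if (p[0] < 0x80) {
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        need = 2; cp = p[0] & 0x1F; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {
        need = 3; cp = p[0] & 0x0F; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {
        need = 4; cp = p[0] & 0x07; min = 0x10000;
    } else {
        return 0;
    }
    if (len < need) {
        return 0;
    }
    for (i = 1; i < need; i++) {
        if ((p[i] & 0xC0) != 0x80) {
            return 0;
        }
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;
    }
    return need;
}

/*
 * Copy input to a NUL-terminated output buffer, replacing every invalid
 * byte with U+FFFD. Returns out on success, NULL on invalid arguments
 * or lack of space.
 */
char *
example_utf8_mask(char *out, size_t size, const char *in, size_t len)
{
    static const char repl[] = "\xEF\xBF\xBD";  /* U+FFFD in UTF-8 */
    size_t o = 0;
    size_t i = 0;

    if (out == NULL || in == NULL) {
        return NULL;
    }
    while (i < len) {
        size_t n = example_utf8_seq_len((const unsigned char *)in + i,
                                        len - i);
        const char *src = (n != 0) ? in + i : repl;
        size_t src_len = (n != 0) ? n : sizeof(repl) - 1;

        /* Keep room for the terminating NUL */
        if (o + src_len >= size) {
            return NULL;
        }
        memcpy(out + o, src, src_len);
        o += src_len;
        /* Skip the whole valid sequence, or just the one invalid byte */
        i += (n != 0) ? n : 1;
    }
    if (o >= size) {
        return NULL;
    }
    out[o] = '\0';
    return out;
}
```

Replacing byte by byte keeps the output valid UTF-8 at the cost of losing the original bytes, which is why diverting them somewhere else is listed as part of the task.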
Implement unit tests for aushape_gbuf and aushape_conv.
Do not output the "node" field in records, as it is already present for the event as "host".
Consider having folding level 0 signify newlines between output documents.
Then documents can end and start on separate lines in files down to folding level 1.