This repository contains tools for generating synthetic Apache logs and for parsing reference empirical logs. See the paper "On Automatic Parsing of Log Records" for details.
For the Apache Fake Log Generator tool, please see the ./generator/ folder. A complete description of how to use the tool is given in the ./generator/README.md file.
For the tool used to parse the real logs VA, VB, and VC and convert them into a format ingestible by a machine learning model, please see the ./real_log_cleaner folder. The origins of the logs are as follows: VA, VB, and VC. Additional details are given in the ./real_log_cleaner/README.md file.
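To give a rough idea of what parsing an Apache log record involves, here is a minimal sketch that splits one line in the standard Apache Combined Log Format into named fields. It is an illustration only and is not the code in ./real_log_cleaner; the names `COMBINED_LOG_PATTERN` and `parse_combined_line` are hypothetical, and the real logs may use a different format, so refer to the README in that folder for the actual tool.

```python
import re
from typing import Dict, Optional

# Apache Combined Log Format:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
# Assumption: quoted fields contain no escaped double quotes.
COMBINED_LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_combined_line(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = COMBINED_LOG_PATTERN.match(line)
    if match is None:
        return None
    return match.groupdict()


# Example line taken from the Apache HTTP Server documentation.
sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
fields = parse_combined_line(sample)
```

All values are returned as strings; converting them (for example, the status code to an integer) is left to whatever downstream model preparation is needed.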
To view all of the sample log files used (including the three real log files, as well as the five generated log files mentioned in the paper), please visit the data repository.
The details of the tool and the data are given in a preprint. The final version of the paper was published in the proceedings of the International Conference on Software Engineering (ICSE’21); you can see the recording of the presentation here. Please cite the tool and data as:
@INPROCEEDINGS{rand2021log,
author={Rand, Jared and Miranskyy, Andriy},
booktitle={2021 IEEE/ACM 43rd International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)},
title={{On Automatic Parsing of Log Records}},
year={2021},
pages={41-45},
doi={10.1109/ICSE-NIER52604.2021.00017}
}
This project is licensed under the MIT License.
If you find a bug or have an idea for a new feature, please open an issue or pull request.
This work was supported and funded by Ryerson University and the Natural Sciences and Engineering Research Council of Canada.