Linux daemon to aggregate data from web server logs in ELF format for Munin

Elfstatsd

Elfstatsd is a backend component for elfstats.

It is a daemon that parses the access logs of different HTTP servers (Apache, Tomcat, Varnish, etc.) and stores the aggregated statistics in report files. The extracted information can later be visualized with monitoring tools (so far, Munin is the only monitoring tool supported for visualization). Elfstatsd processes logs in NCSA Common Log Format / Extended Log Format (ELF) and reports metrics such as the number of calls and slow calls, aggregated latencies (min, max, avg, percentiles), the distribution of response codes, the number of matches for specific patterns, and more. All these statistics are reported per request group; the groups are specified by the user and can be as detailed as needed. The results are written to files that are updated on each round of the daemon's execution.
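To make the latency metrics more concrete, here is a minimal standalone sketch of how such a per-group summary could be computed. It is not taken from the daemon's code: the function name, the `slow_threshold_ms` parameter, the default percentile list, and the nearest-rank percentile method are all assumptions made for illustration.

```python
import math


def summarize_latencies(latencies_ms, slow_threshold_ms=1000, percentiles=(50, 90, 99)):
    """Aggregate the latencies of one request group (illustrative only)."""
    ordered = sorted(latencies_ms)
    count = len(ordered)
    if count == 0:
        return {"calls": 0, "slow_calls": 0}
    summary = {
        "calls": count,
        # A call counts as slow when it exceeds the (assumed) threshold.
        "slow_calls": sum(1 for value in ordered if value > slow_threshold_ms),
        "min": ordered[0],
        "max": ordered[-1],
        "avg": sum(ordered) / float(count),
    }
    for p in percentiles:
        # Nearest-rank percentile: the smallest sample such that at least
        # p percent of all samples are less than or equal to it.
        rank = max(1, int(math.ceil(p / 100.0 * count)))
        summary["p%d" % p] = ordered[rank - 1]
    return summary


if __name__ == "__main__":
    print(summarize_latencies([100, 200, 300, 400, 1500]))
```

Other percentile definitions (for example, with linear interpolation) give slightly different values on small samples, so the method used here should be read as one possible choice.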

The advantage of this tool over other scripts and utilities that monitor web servers through access log files is its flexibility, which lets it cover a wide range of monitoring tasks. Such tasks usually require a lot of configuration effort from network administrators, because proper visual monitoring depends on fine-grained tuning of the tracked requests. Elfstatsd parses the requests found in the access logs and automatically assigns them to groups using simple regex-based rules. It also provides settings for finer control over how requests are distributed into groups, should that be needed. It can process any ELF-based log file; to make it work, you only need to copy the format setting from your web server's config into elfstatsd's configuration.
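The idea of regex-based grouping can be sketched as follows. This is a simplified illustration, not the daemon's actual rule engine: the `RULES` list, its tuple layout, the example URL patterns, and the `classify` function are hypothetical, and the real settings format is documented in settings.py.

```python
import re

# Hypothetical rules: (compiled pattern, group name, call name).
# The first matching rule wins.
RULES = [
    (re.compile(r"^/api/users/\d+$"), "api", "user_by_id"),
    (re.compile(r"^/static/"), "static", "asset"),
]


def classify(path):
    """Return (group, call) for a request path, or None if no rule matches."""
    for pattern, group, call in RULES:
        if pattern.search(path):
            return group, call
    return None


if __name__ == "__main__":
    for path in ("/api/users/42", "/static/css/site.css", "/unknown"):
        print(path, classify(path))
```

Ordering the rules from most to least specific keeps a broad pattern from swallowing requests that deserve their own group.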

The daemon is written in Python and requires Python 2.6.x or 2.7.x to run. Python 3.x support is planned.

To display the statistics aggregated by this daemon in Munin, a number of plugins are needed. These plugins are distributed separately and are available in the elfstats-munin repository. To simplify the daemon's installation, you can check out the elfstats-env repository, which contains a Python virtual environment with all the required dependencies (so far, elfstats-env supports only RHEL6).

Build and install

Unfortunately, the installation procedure is a bit more complicated than usual for Python packages, because elfstatsd is a Linux daemon and requires more permissions than the average Python package does. I did my best to simplify the installation process as much as possible, and you are welcome to share your thoughts on making it even simpler. However, if you install elfstatsd from an RPM using the provided environment rather than from source, all you need to do is install the packages with yum.

Elfstatsd is distributed as source code and as RPM packages. There are also plans to add elfstatsd to PyPI in the future. RPM files for RHEL6_x64 can be found in Releases. Packaging scripts for other POSIX flavors are not yet implemented; let me know if you are interested in having a package for your OS distribution. If you need a distribution in a different format, you can build elfstatsd from source as explained below.

It is recommended, though not required, to set up a virtual environment for running this daemon. An RPM with such an environment, set up and ready to go, is maintained in the elfstats-env repository. If the provided package does not suit you, you can use the default Python installation or create a virtual environment yourself. Instructions are provided below.

Installing elfstatsd from the source codes

Clone the repository: git clone https://github.com/dzzh/elfstatsd.git and enter it with cd elfstatsd.
If you want to use a virtual environment, switch to it by running source /path/to/virtualenv/bin/activate. If you want to install elfstatsd using your default Python, skip this step, but make sure that your Python version is either 2.6.x or 2.7.x by running python --version.
Install the dependencies. The daemon requires a number of other Python modules to operate. They are listed in requirements.txt and can be installed with pip: install pip, then run pip install -r requirements.txt. (If you work with a virtual environment, pip is already installed there.)
Install the module using its setup script: python setup.py install. Write access to /etc/sysconfig/ is needed for the installation to complete correctly.
Run the post-install script as root: sudo sh scripts/post-install.sh. This simple script creates the directories for internal process logging and .pid file storage and performs some other necessary post-installation procedures.
If you work with a virtual environment or install elfstatsd to a location other than the default one for Python packages, make the necessary adjustments to /etc/sysconfig/elfstatsd to help the launcher locate the code.

Building and installing RPM for RHEL 6

Clone the repository: git clone https://github.com/dzzh/elfstatsd.git and enter it with cd elfstatsd.
If you want the resulting RPM to install elfstatsd to a virtual environment, switch to that environment with source /path/to/virtualenv/bin/activate. To set up such an environment, you can install an RPM from the elfstats-env repository. Pre-built RPMs for RHEL6 are available in its Releases.
Build the RPM: python setup.py bdist_rpm. After this step, the RPMs are placed in the dist/ directory.
Install the RPM: sudo yum install dist/elfstatsd-XX.XX.noarch.rpm. This RPM can later be installed on other machines without being rebuilt, but these machines must either have a virtual environment located at the same path as on the build machine or use the default Python with the dependencies installed. Installing the dependencies is described in the "Installing elfstatsd from the source codes" section.
If you work with a virtual environment or install elfstatsd to a location other than the default one for Python packages, make the necessary adjustments to /etc/sysconfig/elfstatsd to help the launcher locate the code.

Configure

Elfstatsd is configured through the settings.py file in the elfstatsd directory. This file contains all the settings supported by the daemon along with their documentation. Please refer to the file for more information.

When updating elfstatsd to a newer version, make sure to review the changes in the configuration file. The daemon is at an early stage of development, so full backward compatibility of the settings is not guaranteed. However, all changes will be documented in the configuration file.

Run

elfstatsd can be run using a launcher that is installed into /etc/init.d/elfstatsd. To start the daemon, run sudo /etc/init.d/elfstatsd start; to stop the daemon, run sudo /etc/init.d/elfstatsd stop.

To find the location of the daemon's main file, the launcher uses a configuration file placed in /etc/sysconfig/elfstats. By default, it is assumed that elfstatsd is run using the default Python and is located in the default place for Python packages. If you work with a virtual environment or have elfstatsd installed in a location where Python cannot find it, adjust the settings in /etc/sysconfig/elfstats as needed.

When running, the daemon needs write access to the following directories:

/tmp: to store files with aggregated data
/var/log/elfstatsd: for internal logging
/var/run/elfstatsd: for .pid file

All of these paths can be changed in settings.py. Make sure that the user launching the daemon has write access to all of them.
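Before starting the daemon, it can be handy to verify these permissions for the account that will run it. The following small check is a sketch that is not part of elfstatsd; the `unwritable_dirs` helper is hypothetical, and the directory list simply repeats the defaults named above.

```python
import os


def unwritable_dirs(paths):
    """Return the paths that are not existing directories writable by the current user."""
    return [p for p in paths if not (os.path.isdir(p) and os.access(p, os.W_OK))]


if __name__ == "__main__":
    # Default locations listed above; adjust if they were changed in settings.py.
    defaults = ["/tmp", "/var/log/elfstatsd", "/var/run/elfstatsd"]
    for path in unwritable_dirs(defaults):
        print("no write access: %s" % path)
```

Run the script as the same user that launches the daemon, since the result depends on the effective user's permissions.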

Test

Running unit tests

Elfstatsd uses py.test as its testing framework. It is not declared as a requirement of the project, and you don't need it to build and run the daemon. However, if you want to change the code and run the available tests to make sure your changes didn't break existing functionality, you can execute python setup.py test. If py.test is not installed on your machine, it will be downloaded automatically.

Data visualization with Munin

To show data aggregated by elfstatsd in Munin, you need a set of plugins that parse the aggregated data and send it to Munin. These plugins can be installed from the elfstats-munin repository.

Troubleshooting

If you face problems with the daemon, start troubleshooting by inspecting its logs (they are located in /var/log/elfstatsd by default). If the logs do not contain enough information to help you detect the cause of the failures, try running the daemon after changing its stdout and stderr paths in elfstatsd_daemon.py from /dev/null to /dev/tty. This adds console logging for the initial launch stage, which is not covered by the internal logging. Also, feel free to contact me or raise an issue if you have a problem that you cannot resolve yourself.

License

Elfstatsd is available under the terms of the MIT License.

Developed at TomTom. Inspired by Oleg Sigida.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

elfstatsd's Issues

Time shift specifier in data files

Allow specifying a time shift value when converting a log file name with datetime placeholders to a valid absolute path.

Example: it should be possible to specify the log file as '/var/log/httpd/apache-%Y-%m-%d-%H?ts=-3600', where the question mark is a separator and ts=-3600 is the time shift in seconds. If this shift is set, the daemon should first shift the current time and then build the file path from the shifted datetime value, not the current one.
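A minimal sketch of the requested behavior, assuming the syntax from the example (a `?` separator followed by `ts=<seconds>`). The function name `resolve_log_path` is hypothetical and this is not the daemon's implementation.

```python
from datetime import datetime, timedelta


def resolve_log_path(template, now):
    """Expand datetime placeholders, applying an optional '?ts=<seconds>' shift first."""
    path, separator, query = template.partition("?")
    shift_seconds = 0
    if separator and query.startswith("ts="):
        shift_seconds = int(query[len("ts="):])
    # Shift the time first, then substitute the placeholders.
    shifted = now + timedelta(seconds=shift_seconds)
    return shifted.strftime(path)


if __name__ == "__main__":
    template = "/var/log/httpd/apache-%Y-%m-%d-%H?ts=-3600"
    print(resolve_log_path(template, datetime(2014, 1, 1, 0, 30)))
```

With a shift of -3600 seconds, a run at 00:30 on 1 January 2014 resolves to the file for hour 23 of 31 December 2013, which is the point of the feature: the daemon can read the log of the previous hour right after rotation.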

Generalize work with storages

So far we have four different storages to keep aggregated values. This leads to some duplication and generally less elegant code. It would be better to generalize working with them into a kind of framework that makes it easy to create, clean and dump them.

In-place logs: properly read last records of previous file

When in-place rotating logs are used and the daemon is invoked after the access log file has been replaced, the records added between the previous daemon invocation and the file replacement are lost. They have to be read properly if the path to the previous file is known.

Storage size may grow large if daemon is running for long

For each processed file, the daemon keeps the associated seek position in a hash map. If the daemon runs for a very long time without a restart, this data structure may become rather large. To decrease the storage size, the data should be tracked there without replacing the datetime placeholders.

Limit number of calls per group

Some groups may have so many different calls that the respective charts become very difficult to analyze later on. A setting specifying the maximum number of calls per group is required; if the number of calls exceeds it, the extra calls should be put into a group with a different name, e.g. with an auto-incremented number as a suffix.

Example: if MAX_CALLS_PER_GROUP is set to 5 and there are 11 calls that should go to group data, the first five go to group data, the next five go to group data2, and the last one goes to group data3.
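The naming scheme from the example can be sketched like this. The `assign_groups` function is hypothetical and only illustrates the suffix arithmetic; how the daemon would order the calls is not specified in the issue, so input order is assumed here.

```python
def assign_groups(group, calls, max_calls_per_group):
    """Map each distinct call name to 'group', 'group2', 'group3', ... in input order."""
    assignment = {}
    for index, call in enumerate(calls):
        chunk = index // max_calls_per_group  # 0 for the first chunk of calls
        # The first chunk keeps the plain name; later chunks get suffixes 2, 3, ...
        assignment[call] = group if chunk == 0 else "%s%d" % (group, chunk + 1)
    return assignment


if __name__ == "__main__":
    calls = ["call%d" % number for number in range(11)]
    for call, name in sorted(assign_groups("data", calls, 5).items()):
        print(call, name)
```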

For each request add response codes distribution

So far, the daemon only aggregates response codes over all the parsed requests and reports them in a separate section. In addition, the same information should be added for each parsed method.

Gather more aggregated information

So far the daemon mostly works with requests and response codes, but log files contain much more useful information that can be aggregated and monitored, e.g. the average and total number of bytes sent. Think about what useful data can be retrieved, how to abstract the parsing of these data, and how to build pluggable modules for their retrieval.

Support of several different log formats

If one instance of the daemon processes log files generated by different tools, it is highly likely that their log formats will differ. Currently elfstatsd allows only one log format to be specified, in the ELF_FORMAT option. This option has to be changed to a list. The user will then specify an index into this list in the DATA_FILES option as one of the tuple elements.

Write metadata to file with aggregated results

Metadata such as the parsed time period, the daemon invocation time and the daemon version should be added to the aggregated file in a metadata section.

Proper functional testing of a flow

elfstatsd_class has to be covered with functional tests

Count aggregated values for occurrences of specific patterns in URLs

It is requested to add the ability to extract certain patterns from the logs, count either the number of distinct values or the total number of occurrences, and report it. This can be used, e.g., to count the total number of [distinct] users that accessed the server within the tracked period.
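As an illustration of the two counting modes, the sketch below extracts the first capture group of a regex from a list of URLs and reports both totals. The `count_pattern` function, the `uid` query parameter, and the sample URLs are invented for this example.

```python
import re


def count_pattern(urls, pattern):
    """Return (total occurrences, distinct values) of the pattern's first capture group."""
    regex = re.compile(pattern)
    values = []
    for url in urls:
        values.extend(match.group(1) for match in regex.finditer(url))
    return len(values), len(set(values))


if __name__ == "__main__":
    urls = ["/items?uid=alice", "/items?uid=bob", "/cart?uid=alice", "/health"]
    total, distinct = count_pattern(urls, r"uid=(\w+)")
    print("total: %d, distinct: %d" % (total, distinct))
```

Keeping all matched values in memory is acceptable for a sketch; over a long tracked period, a set alone (for distinct counts) or a plain counter (for totals) would be cheaper.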

Add support for Python 2.4.x and 2.5.x

The daemon should work correctly with Python 2.4.x-2.7.x.

Protection against non-defined settings

If a setting is missing, try to substitute it with the default value.
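One simple way to get this behavior is to look up each setting through a helper that falls back to a table of defaults. The sketch below is only an assumption about how it could look: `DEFAULTS`, `get_setting`, the `INTERVAL` name, and the values are hypothetical; `MAX_CALLS_PER_GROUP` is borrowed from another issue in this list.

```python
import types

# Hypothetical default values, used only when settings.py does not define a name.
DEFAULTS = {"INTERVAL": 300, "MAX_CALLS_PER_GROUP": 20}


def get_setting(settings, name):
    """Read a setting from a settings module, falling back to DEFAULTS if it is missing."""
    if hasattr(settings, name):
        return getattr(settings, name)
    if name in DEFAULTS:
        return DEFAULTS[name]
    raise KeyError("setting %r is not defined and has no default" % name)


if __name__ == "__main__":
    # Stand-in for the real settings module, with only one option defined.
    settings = types.ModuleType("settings")
    settings.INTERVAL = 60
    print(get_setting(settings, "INTERVAL"))
    print(get_setting(settings, "MAX_CALLS_PER_GROUP"))
```

Raising an error for names without a default keeps typos in setting names from being silently ignored.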

Write execution time to metadata

Write to the report file the time it took the daemon to process the log file. This should help with investigating performance issues, should they arise because of a large number of (poor) regexes.

Better separation of regexes for different log files

If a user specifies several different log files to be processed by a single elfstatsd instance, all of them have to share the same regexes in the settings, as these are applied globally. It should be possible to specify regexes for each log file separately, as this will be more efficient and flexible.

Allow to specify percentiles in settings

Currently the aggregated percentiles are hard-coded. Instead, it should be possible to provide their list in the settings.

Add support for Python 3.x

The daemon should work correctly with Python 3.0.x-3.3.x.

Show number of skipped and not parsed requests

An additional section for aggregated data is required that displays the number of skipped calls and the number of calls that were not parsed at all.

In-place logs: sometimes daemon reads old file again in full

When an access log that rotates in place is re-created, the daemon sometimes handles this situation incorrectly and reads the whole old file instead of the new one. The issue has to be investigated, and its cause found and resolved.

As a quick solution, an additional check for the time period start can be added for the moment when the daemon starts to read a file. Currently only the period end is checked: the daemon assumes that the period start is correct once the seek is done and it starts reading the file.

Add py.test support to setup.py

Integrate the setup script with py.test as described in section 1.4.4 of the py.test documentation. After this, unit tests will run automatically each time the project is built.

If processing of one file gives an error, also process the other files.

If several different log files are processed and an error is raised for one of them, it is caught in the exception handler, but the remaining files are not processed until the next round. This has to be fixed.
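The usual remedy is to move the exception handling inside the per-file loop, so that a failure is recorded and the loop continues. A minimal sketch, assuming a hypothetical `process_one` callable that handles a single log file (the names here are not taken from the daemon's code):

```python
import logging

logger = logging.getLogger("elfstatsd-sketch")


def process_all(paths, process_one):
    """Process every file; a failure in one file is logged and does not stop the rest."""
    failed = []
    for path in paths:
        try:
            process_one(path)
        except Exception:
            # Log the error with its traceback and move on to the next file.
            logger.exception("failed to process %s", path)
            failed.append(path)
    return failed
```

Returning the list of failed paths lets the caller decide whether to retry them in the next round or report them in the output.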

Improve logging for missing files

When the daemon encounters a missing file, it writes three ERROR lines to the log and prints a trace. Instead, only one log line should be created.

In certain situations daemon restarts immediately

In some situations, when the daemon faces an error, it restarts immediately instead of waiting for the time interval to pass. This occurs if a data file with the access log is not found and, sometimes, with in-place rotating logs when a new log file is created.

The issue has to be investigated to find its cause and prevent this behavior. In case of a severe error, the daemon should report it and wait for the whole time interval between runs before restarting, not restart immediately.

Make daemon independent from environment variables

Instead of using environment variables to define the location of the virtualenv and/or the module, use /etc/sysconfig/. If no settings are set there, the default Python installation has to be assumed.
