hdrhistogram_py's Introduction

HdrHistogram

HdrHistogram: A High Dynamic Range (HDR) Histogram

This repository currently includes a Java implementation of HdrHistogram. C, C#/.NET, Python, Javascript, Rust, Erlang, and Go ports can be found in other repositories, all of which share common concepts and data representation capabilities. Look at the repositories under the HdrHistogram organization for the various implementations and useful tools.

Note: The below is an excerpt from a Histogram JavaDoc. While much of it generally applies to other language implementations as well, some details may vary by implementation (e.g. iteration and synchronization), so you should consult the documentation or header information of the specific API library you intend to use.


HdrHistogram supports the recording and analyzing of sampled data value counts across a configurable integer value range with configurable value precision within the range. Value precision is expressed as the number of significant digits in the value recording, and provides control over value quantization behavior across the value range and the subsequent value resolution at any given level.

For example, a Histogram could be configured to track the counts of observed integer values between 0 and 3,600,000,000 while maintaining a value precision of 3 significant digits across that range. Value quantization within the range will thus be no larger than 1/1,000th (or 0.1%) of any value. This example Histogram could be used to track and analyze the counts of observed response times ranging between 1 microsecond and 1 hour in magnitude, while maintaining a value resolution of 1 microsecond up to 1 millisecond, a resolution of 1 millisecond (or better) up to one second, and a resolution of 1 second (or better) up to 1,000 seconds. At its maximum tracked value (1 hour), it would still maintain a resolution of 3.6 seconds (or better).
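The figures in the example above follow from simple arithmetic: with 3 significant digits, quantization at any value is bounded by that value divided by 1,000. The following is a minimal sketch of that bound only; the function name `worst_case_quantization` is hypothetical and is not part of any HdrHistogram API, and a real histogram may do better than the bound.

```python
def worst_case_quantization(value, significant_digits):
    """Upper bound on quantization at `value` (same unit as `value`)."""
    return value / 10 ** significant_digits


# Values are in microseconds, matching the response-time example.
for label, value_usec in [
    ("1 millisecond", 1_000),
    ("1 second", 1_000_000),
    ("1,000 seconds", 1_000_000_000),
    ("1 hour", 3_600_000_000),
]:
    bound = worst_case_quantization(value_usec, 3)
    print(f"{label}: resolution of {bound:,.0f} usec or better")
```

At one hour the bound works out to 3,600,000 usec, which is the 3.6 seconds quoted in the text.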

The HdrHistogram package includes the Histogram implementation, which tracks value counts in long fields, and is expected to be the commonly used Histogram form. IntHistogram and ShortHistogram, which track value counts in int and short fields respectively, are provided for use cases where smaller count ranges are practical and smaller overall storage is beneficial.

HdrHistogram is designed for recording histograms of value measurements in latency and performance sensitive applications. Measurements show value recording times as low as 3-6 nanoseconds on modern (circa 2012) Intel CPUs. AbstractHistogram maintains a fixed cost in both space and time. A Histogram's memory footprint is constant, with no allocation operations involved in recording data values or in iterating through them. The memory footprint is fixed regardless of the number of data value samples recorded, and depends solely on the dynamic range and precision chosen. The amount of work involved in recording a sample is constant, and directly computes storage index locations such that no iteration or searching is ever involved in recording data values.

A combination of high dynamic range and precision is useful for collection and accurate post-recording analysis of sampled value data distribution in various forms. Whether it's calculating or plotting arbitrary percentiles, iterating through and summarizing values in various ways, or deriving mean and standard deviation values, the fact that the recorded data information is kept in high resolution allows for accurate post-recording analysis with low [and ultimately configurable] loss in accuracy when compared to performing the same analysis directly on the potentially infinite series of sourced data values samples.

A common use example of HdrHistogram would be to record response times in units of microseconds across a dynamic range stretching from 1 usec to over an hour, with a good enough resolution to support later performing post-recording analysis on the collected data. Analysis can include computing, examining, and reporting of distribution by percentiles, linear or logarithmic value buckets, mean and standard deviation, or by any other means that can be easily added by using the various iteration techniques supported by the Histogram. In order to facilitate the accuracy needed for various post-recording analysis techniques, this example can maintain a resolution of ~1 usec or better for times ranging to ~2 msec in magnitude, while at the same time maintaining a resolution of ~1 msec or better for times ranging to ~2 sec, and a resolution of ~1 second or better for values up to 2,000 seconds. This sort of example resolution can be thought of as "always accurate to 3 decimal points." Such an example Histogram would simply be created with a highestTrackableValue of 3,600,000,000, and a numberOfSignificantValueDigits of 3, and would occupy a fixed, unchanging memory footprint of around 185KB (see "Footprint estimation" below).

Histogram variants and internal representation

The HdrHistogram package includes multiple implementations of the AbstractHistogram class:

  • Histogram, which is the commonly used Histogram form and tracks value counts in long fields.
  • IntHistogram and ShortHistogram, which track value counts in int and short fields respectively, are provided for use cases where smaller count ranges are practical and smaller overall storage is beneficial (e.g. systems where tens of thousands of in-memory histograms are being tracked).
  • AtomicHistogram and SynchronizedHistogram (see 'Synchronization and concurrent access' below)

Internally, data in HdrHistogram variants is maintained using a concept somewhat similar to that of floating point number representation: using an exponent and a (non-normalized) mantissa to support a wide dynamic range at a high but varying (by exponent value) resolution. AbstractHistogram uses exponentially increasing bucket value ranges (the parallel of the exponent portion of a floating point number), with each bucket containing a fixed number of linear sub-buckets (the parallel of a non-normalized mantissa portion of a floating point number). Both dynamic range and resolution are configurable, with highestTrackableValue controlling dynamic range, and numberOfSignificantValueDigits controlling resolution.
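The exponent/mantissa analogy can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's code: it assumes non-negative integer values, a lowest discernible value of 1, and a power-of-two sub-bucket count derived as in "Footprint estimation". The names `sub_bucket_magnitude`, `decompose`, and `recompose` are hypothetical.

```python
def sub_bucket_magnitude(significant_digits):
    """log2 of the (power-of-two) number of sub-buckets per bucket."""
    largest_single_unit = 2 * 10 ** significant_digits
    sub_bucket_count = 1 << (largest_single_unit - 1).bit_length()
    return sub_bucket_count.bit_length() - 1


def decompose(value, significant_digits=3):
    """Split a value into (bucket_index, sub_bucket_index).

    bucket_index plays the role of the exponent and sub_bucket_index
    the role of the non-normalized mantissa.
    """
    magnitude = sub_bucket_magnitude(significant_digits)
    bucket_index = max(0, value.bit_length() - magnitude)
    sub_bucket_index = value >> bucket_index
    return bucket_index, sub_bucket_index


def recompose(bucket_index, sub_bucket_index):
    """Lowest value that maps to the given (bucket, sub-bucket) pair."""
    return sub_bucket_index << bucket_index


print(decompose(1500))   # small values keep single-unit resolution
print(decompose(5003))   # larger values share a sub-bucket with neighbours
```

Both steps are direct computations on the bits of the value, which is consistent with the constant-time recording described earlier: no search over buckets is needed.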

Synchronization and concurrent access

In the interest of keeping value recording cost to a minimum, the commonly used Histogram class and its IntHistogram and ShortHistogram variants are NOT internally synchronized, and do NOT use atomic variables. Callers wishing to make potentially concurrent, multi-threaded updates or queries against Histogram objects should either take care to externally synchronize and/or order their access, or use the ConcurrentHistogram, AtomicHistogram, or SynchronizedHistogram variants.

A common pattern seen in histogram value recording involves recording values in a critical path (multi-threaded or not), coupled with a non-critical path reading the recorded data for summary/reporting purposes. When such continuous non-blocking recording operation (concurrent or not) is desired even when sampling, analyzing, or reporting operations are needed, consider using the Recorder and SingleWriterRecorder recorder variants that were specifically designed for that purpose. Recorders provide a recording API similar to Histogram, and internally maintain and coordinate active/inactive histograms such that recording remains wait-free in the presence of accurate and stable interval sampling.

It is worth mentioning that since Histogram objects are additive, it is common practice to use per-thread non-synchronized histograms or SingleWriterRecorders, and use a summary/reporting thread to perform histogram aggregation math across time and/or threads.
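The per-thread pattern can be illustrated without any histogram library. The sketch below uses `collections.Counter` as a stand-in for a histogram (an assumption made purely for illustration; it has none of HdrHistogram's bucketing): each thread records into its own unsynchronized counter, and a reporting step adds them together afterwards.

```python
import threading
from collections import Counter

NUM_THREADS = 4


def worker(values, local_counts):
    # Each thread touches only its own Counter, so no locking is needed.
    for v in values:
        local_counts[v] += 1


per_thread = [Counter() for _ in range(NUM_THREADS)]
threads = [
    threading.Thread(target=worker,
                     args=(range(i, 1000, NUM_THREADS), per_thread[i]))
    for i in range(NUM_THREADS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The reporting step: counts are additive, so merging is a plain sum.
total = Counter()
for counts in per_thread:
    total.update(counts)

print(sum(total.values()))
```

Aggregation here happens only after the writers have finished; sampling while recording continues is the case the Recorder variants described above are designed for.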

Iteration

Histograms support multiple convenient forms of iterating through the histogram data set, including linear, logarithmic, and percentile iteration mechanisms, as well as means for iterating through each recorded value or each possible value level. The iteration mechanisms are accessible through the HistogramData available through getHistogramData(). Iteration mechanisms all provide HistogramIterationValue data points along the histogram's iterated data set, and are available for the default (corrected) histogram data set via the following HistogramData methods:

  • percentiles: An Iterable<HistogramIterationValue> through the histogram using a PercentileIterator
  • linearBucketValues: An Iterable<HistogramIterationValue> through the histogram using a LinearIterator
  • logarithmicBucketValues: An Iterable<HistogramIterationValue> through the histogram using a LogarithmicIterator
  • recordedValues: An Iterable<HistogramIterationValue> through the histogram using a RecordedValuesIterator
  • allValues: An Iterable<HistogramIterationValue> through the histogram using an AllValuesIterator

Iteration is typically done with a for-each loop statement. E.g.:

 for (HistogramIterationValue v :
      histogram.getHistogramData().percentiles(ticksPerHalfDistance)) {
     ...
 }

or

 for (HistogramIterationValue v :
      histogram.getRawHistogramData().linearBucketValues(unitsPerBucket)) {
     ...
 }

The iterators associated with each iteration method are resettable, such that a caller that would like to avoid allocating a new iterator object for each iteration loop can re-use an iterator to repeatedly iterate through the histogram. This iterator re-use usually takes the form of a traditional loop using the Iterator's hasNext() and next() methods.

So to avoid allocating a new iterator object for each iteration loop:

 // The cast assumes the Iterable hands back a PercentileIterator.
 PercentileIterator iter = (PercentileIterator)
    histogram.getHistogramData().percentiles(ticksPerHalfDistance).iterator();
 ...
 iter.reset(ticksPerHalfDistance);
 while (iter.hasNext()) {
     HistogramIterationValue v = iter.next();
     ...
 }

Equivalent Values and value ranges

Due to the finite (and configurable) resolution of the histogram, multiple adjacent integer data values can be "equivalent". Two values are considered "equivalent" if samples recorded for both are always counted in a common total count due to the histogram's resolution level. HdrHistogram provides methods for determining the lowest and highest equivalent values for any given value, as well as determining whether two values are equivalent, and for finding the next non-equivalent value for a given value (useful when looping through values, in order to avoid double counting).
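To make the notion of equivalence concrete, here is a minimal Python sketch. It assumes non-negative integer values, a lowest discernible value of 1, and the power-of-two sub-bucket layout described under "Footprint estimation"; the function names are hypothetical and do not correspond to a specific library API.

```python
def _shift(value, significant_digits):
    """Number of low-order bits that are dropped when `value` is recorded."""
    sub_bucket_count = 1 << (2 * 10 ** significant_digits - 1).bit_length()
    magnitude = sub_bucket_count.bit_length() - 1
    return max(0, value.bit_length() - magnitude)


def lowest_equivalent_value(value, significant_digits=3):
    s = _shift(value, significant_digits)
    return (value >> s) << s


def next_non_equivalent_value(value, significant_digits=3):
    s = _shift(value, significant_digits)
    return lowest_equivalent_value(value, significant_digits) + (1 << s)


def highest_equivalent_value(value, significant_digits=3):
    return next_non_equivalent_value(value, significant_digits) - 1


def values_are_equivalent(a, b, significant_digits=3):
    return (lowest_equivalent_value(a, significant_digits)
            == lowest_equivalent_value(b, significant_digits))


# With 3 significant digits, values around 5,000 are grouped in fours.
print(lowest_equivalent_value(5001), highest_equivalent_value(5001))
print(values_are_equivalent(5001, 5003), values_are_equivalent(5003, 5004))
```

Stepping with `next_non_equivalent_value` visits each counted slot exactly once, which is the double-counting concern mentioned above.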

Corrected vs. Raw value recording calls

In order to support a common use case needed when histogram values are used to track response time distribution, Histogram provides for the recording of corrected histogram values through a recordValueWithExpectedInterval() variant. This value recording form is useful in [common latency measurement] scenarios where response times may exceed the expected interval between issuing requests, leading to "dropped" response time measurements that would typically correlate with "bad" results.

When a value recorded in the histogram exceeds the expectedIntervalBetweenValueSamples parameter, recorded histogram data will reflect an appropriate number of additional values, linearly decreasing in steps of expectedIntervalBetweenValueSamples, down to the last value that would still be higher than expectedIntervalBetweenValueSamples.

To illustrate why this corrective behavior is critically needed in order to accurately represent value distribution when large value measurements may lead to missed samples, imagine a system for which response time samples are taken once every 10 msec to characterize response time distribution. The hypothetical system behaves "perfectly" for 100 seconds (10,000 recorded samples), with each sample showing a 1 msec response time value. The hypothetical system then encounters a 100 sec pause during which only a single sample is recorded (with a 100 second value). The raw data histogram collected for such a hypothetical system (over the 200 second scenario above) would show ~99.99% of results at 1 msec or below, which is obviously "not right". The same histogram, corrected with the knowledge of an expectedIntervalBetweenValueSamples of 10 msec, will correctly represent the response time distribution: only ~50% of results will be at 1 msec or below, with the remaining ~50% coming from the auto-generated value records covering the missing increments spread between 10 msec and 100 sec.
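The scenario above can be reproduced with a short Python sketch. It uses a `collections.Counter` in place of a histogram and hypothetical function names; the correction loop back-fills values in steps of the expected interval. Whether a value exactly equal to the interval is itself recorded is an assumption here, chosen to match the 10 msec lower bound in the illustration.

```python
from collections import Counter


def record_value(counts, value):
    counts[value] += 1


def record_value_with_expected_interval(counts, value, expected_interval):
    """Record `value`, then back-fill the samples a stalled sampler missed."""
    record_value(counts, value)
    if expected_interval <= 0:
        return
    missing = value - expected_interval
    while missing >= expected_interval:
        record_value(counts, missing)
        missing -= expected_interval


def fraction_at_or_below(counts, threshold):
    total = sum(counts.values())
    return sum(c for v, c in counts.items() if v <= threshold) / total


EXPECTED_INTERVAL_MS = 10
# 10,000 samples of 1 msec, then a single 100 sec (100,000 msec) sample.
samples_ms = [1] * 10_000 + [100_000]

raw, corrected = Counter(), Counter()
for s in samples_ms:
    record_value(raw, s)
    record_value_with_expected_interval(corrected, s, EXPECTED_INTERVAL_MS)

print(f"raw:       {fraction_at_or_below(raw, 1):.4%} at 1 msec or below")
print(f"corrected: {fraction_at_or_below(corrected, 1):.4%} at 1 msec or below")
```

The single 100 sec sample expands into 10,000 records (itself plus 9,999 back-filled values), which is why the corrected data set splits evenly.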

Data sets recorded with and without an expectedIntervalBetweenValueSamples parameter will differ only if at least one value recorded with the recordValue method was greater than its associated expectedIntervalBetweenValueSamples parameter. Data sets recorded with an expectedIntervalBetweenValueSamples parameter will be identical to ones recorded without it if all values recorded via the recordValue calls were smaller than their associated (and optional) expectedIntervalBetweenValueSamples parameters.

When used for response time characterization, the recording with the optional expectedIntervalBetweenValueSamples parameter will tend to produce data sets that would much more accurately reflect the response time distribution that a random, uncoordinated request would have experienced.

Footprint estimation

Due to its dynamic range representation, Histogram is relatively efficient in memory space requirements given the accuracy and dynamic range it covers. Still, it is useful to be able to estimate the memory footprint involved for a given highestTrackableValue and numberOfSignificantValueDigits combination. Beyond a relatively small fixed-size footprint used for internal fields and stats (which can be estimated as "fixed at well less than 1KB"), the bulk of a Histogram's storage is taken up by its data value recording counts array. The total footprint can be conservatively estimated by:

 largestValueWithSingleUnitResolution =
        2 * (10 ^ numberOfSignificantValueDigits);
 subBucketSize =
        roundedUpToNearestPowerOf2(largestValueWithSingleUnitResolution);

 expectedHistogramFootprintInBytes = 512 +
      ({primitive type size} / 2) *
      (log2RoundedUp((highestTrackableValue) / subBucketSize) + 2) *
      subBucketSize

A conservative (high) estimate of a Histogram's footprint in bytes is available via the getEstimatedFootprintInBytes() method.
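As a worked example, the estimate above can be transcribed into Python and applied to the configuration used earlier (highestTrackableValue of 3,600,000,000 with 3 significant digits, counts stored in 8-byte long fields). The function names are hypothetical; this is a direct transcription of the formula, not the library's getEstimatedFootprintInBytes() implementation.

```python
import math


def rounded_up_to_nearest_power_of_2(n):
    return 1 << (n - 1).bit_length()


def estimated_footprint_in_bytes(highest_trackable_value,
                                 significant_digits,
                                 primitive_type_size=8):
    largest_value_with_single_unit_resolution = 2 * 10 ** significant_digits
    sub_bucket_size = rounded_up_to_nearest_power_of_2(
        largest_value_with_single_unit_resolution)
    bucket_term = math.ceil(
        math.log2(highest_trackable_value / sub_bucket_size)) + 2
    return 512 + (primitive_type_size // 2) * bucket_term * sub_bucket_size


footprint = estimated_footprint_in_bytes(3_600_000_000, 3)
print(f"{footprint} bytes, about {footprint / 1024:.1f} KB")
```

The result is 188,928 bytes, in line with the "around 185KB" figure quoted for this configuration.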

hdrhistogram_py's People

Contributors

0xflotus, 5hubh4m, ahothan, asherf, astraw, dimp-gh, filipecosta90, fruch, gokarslan, kannanekanath, lebinh, nyonson, phensley, rrva, snyk-bot, thejcannon, yqmmm, zeulb

hdrhistogram_py's Issues

Unittests fail on 32-bit systems

While working on #19 I've uncovered a few test failures which should probably be addressed.

For example, on 32-bit systems, test_basic fails with AssertionError: 21 != 22. This is due to get_bucket_count() using sys.maxsize to calculate the bucket count, which makes it return 22 on 64-bit platforms and 21 on 32-bit ones. Not sure yet whether it's a calculation bug or whether sys.maxsize should be replaced with something else in the calculation (can you clarify?).

On the same 32-bit system test_hdr_payload just crashes pytest. No idea about how this happens yet.

I also get the impression that the project would benefit from having a CI set up: Appveyor for regularly running Windows tests, TravisCI for doing the same on Linux/Mac. I can provide some help with setting this up, if you're interested.

error while decoding serialized histogram produced by rust version

I am generating histograms in Rust and am deserializing in Python using the HDR Histogram libraries for Rust and Python. The Rust code produces a byte array with the encoded histogram which ends up as a bytes instance in Python. (The project is a Python program that integrates with a Rust library via the cpython crate.)

It appears that the Python library is only able to decode the encoded histogram if Rust encodes using hdrhistogram::serialization::V2DeflateSerializer and further encodes it using base64 (via Python's base64.b64encode).

Without the base64 encoding, decoding with histogram = HdrHistogram.decode(encoded_histogram, b64_wrap=False) results in this error:

Traceback (most recent call last):
  ...
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/histogram.py", line 580, in decode
    hdr_payload = HdrHistogramEncoder.decode(encoded_histogram, b64_wrap)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 356, in decode
    hdr_payload = HdrPayload(8, compressed_payload=cpayload)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 146, in __init__
    self._decompress(compressed_payload)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 197, in _decompress
    self._data = zlib.decompress(compressed_payload)
zlib.error: Error -3 while decompressing data: incorrect header check

Using uncompressed encoding (via hdrhistogram::serialization::V2Serializer in Rust) and base64 encoding in Python results in this error:

Traceback (most recent call last):
  ...
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/histogram.py", line 580, in decode
    hdr_payload = HdrHistogramEncoder.decode(encoded_histogram, b64_wrap)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 346, in decode
    raise HdrCookieException()
hdrh.codec.HdrCookieException

And using uncompressed encoding without base64 results in:

Traceback (most recent call last):
  ...
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/histogram.py", line 580, in decode
    hdr_payload = HdrHistogramEncoder.decode(encoded_histogram, b64_wrap)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 356, in decode
    hdr_payload = HdrPayload(8, compressed_payload=cpayload)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 146, in __init__
    self._decompress(compressed_payload)
  File "XXX/hdrhistogram-0.8.0-cp38-cp38-macosx_10_15_x86_64.whl-install/hdrh/codec.py", line 197, in _decompress
    self._data = zlib.decompress(compressed_payload)
zlib.error: Error -3 while decompressing data: incorrect header check

Doesn't install correctly in 3.6.0b2?

I have a project using hdrhistogram under pyenv, works great with 3.5.2.

Recently I tried 3.6.0b2, and hdrhistogram does not install. The 'hdrh' directory is not created.

$ pyenv install 3.6.0b2
$ pyenv global 3.6.0b2
$ python -m pip install -r requirements.txt
...
Collecting hdrhistogram==0.4.1 (from -r requirements.txt (line 17))
  Using cached hdrhistogram-0.4.1.tar.gz
...
  Running setup.py install for hdrhistogram ... done
...
$ python
Python 3.6.0b2 (default, Oct 27 2016, 11:08:44) 
>>> import hdrh
ModuleNotFoundError: No module named 'hdrh'
>>>^D
$ cd ~/.pyenv
$ find . -name hdrh
./versions/3.5.2/lib/python3.5/site-packages/hdrh

There are other eggs in my requirements.txt, so it's not that all eggs are broken.

I'm not very good at debugging this sort of thing, I'm afraid, but I'm happy to poke at it a bit if that would help.

feature request: HistogramLogReader accepts already open files

It would be useful if HistogramLogReader would accept file-like objects (and not just filenames) as input. This would allow reading a log directly from a zip file, for example.

Here is an example of the desired usage.

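# here using zipfile from the python stdlib
import zipfile
import hdrh.histogram
from hdrh.log import HistogramLogReader
# LOWEST, HIGHEST and SIGNIFICANT stand for the histogram's range and precision
with zipfile.ZipFile(file="my_filename.zip", mode='r') as archive:
    fd = archive.open('my_hist.hlog')
    accumulated_histogram = hdrh.histogram.HdrHistogram(LOWEST, HIGHEST, SIGNIFICANT)
    log_reader = HistogramLogReader(fd, accumulated_histogram)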

HistogramLogReader fails in Windows with Python 3

With hdrhistogram 0.9.0 (and git master) on Windows with Python 2.7.18 (Anaconda) and Ubuntu 20.04 with Python 3.8.10, the following program works fine.

The program:

from hdrh.histogram import HdrHistogram
from hdrh.log import HistogramLogReader

h=HdrHistogram(1,100000,2)
rdr = HistogramLogReader('histogram.hlog', h)
h1 = rdr.get_next_interval_histogram()
print(h1.get_total_count())

The histogram.hlog file mentioned above:

#[StartTime: 1635177123.452 (seconds since epoch)]
7.888,60.535,741375.000,HISTFAAAAZR4nC1O0WrUQBRN7s7OxtsxncaYDWFZ46KLlCUspSxVo4YhrCGEJQxxiCWIBqmylCI+iKiIgj8gPvsNPvgB/pKf4K3thTnn3HPPzNzJtx/XLMv+bV0UXLL9H3dV9ujvpeMguMLxAz+cB/FsuUwWSqujhzrflLoxdWZ0szlpu62uemP6vK4KU1dNtdaqfKZUuW71Kn2uVF20GVmbw+woqVZ52SzSbJ1Wuu11Xrf6Q/P5U3Vqjt+cnjX926Ltz7bv3h9/sV/X5mTbma7rX3apKejPIs9y2iBd5qU6uJdMDxer+XSxH+2vojxZyuTOfHIrjiduKONpFB14s1kUhkEceWGIXhAJX7iewDD0UaIvEYUQyB0mAkmEjHNKCImO8JAzlJ4n0UUuPAfRQeEwSgkXmeRUFkMGjDMHLQBgzAJOggOBJQbABjCkDNuB8xqSPYIdGBDehNvE1LMR7MIV0kM6ezQdwXW4S+oGvXR+g8ZPYAxwlcQejNmYfbfhMbyi9qcND+Argz8D+Aj34ZcNTyn2Av4B9/VDyA==

However, on Windows with Python 3.8.10, I get this error:

Traceback (most recent call last):
  File ".\fail_hdrh.py", line 6, in <module>
    h1 = rdr.get_next_interval_histogram()
  File "C:\Users\drand\anaconda3\envs\pm21-ox\lib\site-packages\hdrh\log.py", line 346, in get_next_interval_histogram
    return self._decode_next_interval_histogram(None,
  File "C:\Users\drand\anaconda3\envs\pm21-ox\lib\site-packages\hdrh\log.py", line 296, in _decode_next_interval_histogram
    histogram = HdrHistogram.decode(cpayload)
  File "C:\Users\drand\anaconda3\envs\pm21-ox\lib\site-packages\hdrh\histogram.py", line 585, in decode
    histogram = HdrHistogram(payload.lowest_trackable_value,
  File "C:\Users\drand\anaconda3\envs\pm21-ox\lib\site-packages\hdrh\histogram.py", line 121, in __init__
    results = hdr_payload.init_counts(self.counts_len)
  File "C:\Users\drand\anaconda3\envs\pm21-ox\lib\site-packages\hdrh\codec.py", line 166, in init_counts
    results = decode(self._data, payload_header_size, addressof(self.counts),
OverflowError: Python int too large to convert to C long

InvocationError using latest version of tox

InvocationError occurs when running tox using the latest version (2.3.1)

GLOB sdist-make: /Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/setup.py
py27 inst-nodeps: /Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/dist/hdrhistogram-0.5.1.zip
py27 installed: flake8==2.5.4,future==0.15.2,hdrhistogram==0.5.1,mccabe==0.4.0,pbr==1.9.1,pep8==1.7.0,py==1.4.31,pyflakes==1.0.0,pytest==2.9.1
py27 runtests: PYTHONHASHSEED='2204800212'
py27 runtests: commands[0] | py.test -q -s --basetemp=/Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/py27/tmp \

no tests ran in 0.00 seconds
ERROR: file not found: \
ERROR: InvocationError: '/Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/py27/bin/py.test -q -s --basetemp=/Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/py27/tmp \\'
py3 inst-nodeps: /Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/dist/hdrhistogram-0.5.1.zip
py3 installed: flake8==2.5.4,future==0.15.2,hdrhistogram==0.5.1,mccabe==0.4.0,pbr==1.9.1,pep8==1.7.0,py==1.4.31,pyflakes==1.0.0,pytest==2.9.1
py3 runtests: PYTHONHASHSEED='2204800212'
py3 runtests: commands[0] | py.test -q -s --basetemp=/Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/py3/tmp \

no tests ran in 0.00 seconds
ERROR: file not found: \
ERROR: InvocationError: '/Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/py3/bin/py.test -q -s --basetemp=/Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/py3/tmp \\'
pep8 inst-nodeps: /Users/zeulb/Dropbox/Programming/Cisco/HdrHistogram_py/.tox/dist/hdrhistogram-0.5.1.zip
pep8 installed: flake8==2.5.4,future==0.15.2,hdrhistogram==0.5.1,mccabe==0.4.0,pbr==1.9.1,pep8==1.7.0,py==1.4.31,pyflakes==1.0.0,pytest==2.9.1
pep8 runtests: PYTHONHASHSEED='2204800212'
pep8 runtests: commands[0] | flake8 hdrh test
_____________________________________________ summary ______________________________________________
ERROR:   py27: commands failed
ERROR:   py3: commands failed
  pep8: commands succeeded

Histograms are encoded with the wrong endianness on PyPy

Moving this issue from HdrHistogram/HdrHistogramJS#2

(I haven't quite located where the bug is here yet, or if it belongs upstream but it definitely doesn't seem to belong where I first left it).

virtualenv --python pypy hdrh && hdrh/bin/pip install hdrhistogram && hdrh/bin/python -c 'from hdrh.histogram import HdrHistogram; print HdrHistogram(1, 2, 3).encode()'; rm -rf hdrh

produces a histogram which is encoded using the wrong endianness (and therefore cannot be decoded with other HDR histogram tools.)

#12 is tangentially related.

HistogramLogReader does not support tags

The histogram log format supports internal lines leading with a "tag" to distinguish multiple different histograms in a file. These appear as a prefix of Tag=something, to the histogram line. Here is an example encoding:

#[BaseTime: 1590784908.935 (seconds since epoch)]
Tag=processing,0.000,1.000,442111.000,HISTFAAABN54nC1VTYscVRStOXXr1qvXr19XT3fb0xmTyTAMQxhCGIJIkBCCiIQsgoi4kOAPyCK4cOFeyA8I4sqFCxfiyqU/Q1z6J/wJnnPfVHVVvbqf5557X/WHP77fdN2Uu3b0t8+jeHn/rnv2XxP8mfDzEUa8xIQVXuEJBuApBWsssMPOLsHFI+x5XlC8wBf4CF/zGnDAGa+Jmi95P6WkwwYPafM9sGSE5/hGYe0tHU/RAzfU7ana8bmi9Rmfayo6hHMsZNjSb7jacD0x8yqki7h6RhhB2063U0aUzcHXkR93Ges0nM5tR9WO4qforaNMClOayZaIcJsIvGMha+o3VN3ldaHIEzpT5AXXg/E2MjF4HygceC6CtEjQMPdhdJfPdYTahIHwvuBdtYpelTqElMoJz8J9EdEuohRE7E9I8htaPGicrPGYYkG4Jv8DPsVrrlTzujntWQPle3HPV6LHFYOLwyVtz6m7wE9Ke4F/odwH/IIga4XfjtC4uqb2hoNwFXV2VD1lqGUgnSi94f0H2j3Dd6YydnRf4p1m6CGhkrAuCn/ExDsOhijZ4y2I8jUNOWwPot4h8I7s0Es+r/n+LS7bIDxEFPQ5EHQebIBpVtiOYO4i4PS2CEYGfIXI8soAm4KRCaFYNQ/9lPCKhIqV4PnmlnWtzyLPE+AzJYoYE6I69n2iV9DfB9Wy1Oiql63Zsvs4oi057yNemNZq/BWNBpFvmhjG+f2IreNhRqxBBLTQqKwsUPZO+qxzGlhYahq4c+C9HlHM0IoyJIIlhQznC3O6UUid07O3QMTVGWNy0YZ4jCpO0YiG2rVC4LUuwK6sT2YdAXfmFDufh2g+Fkyj2RIZ1F8zhitIx8L6mNqgWbCXwmrC44idh9YiT9Jay6g5cUt6pYndyvkbjVuUDXaniKltaHjdm1ove6p6l1zb2bRJeUyUsOBzmbXeryx8lf5xzJNmrterm7fdcrilVfONxMWi0YGiLo3wKmy2IFbPClVkOMZ8DKaGISpRuclF6bIR0Vk6To2DQG0sliYW3e9zEkPqFTvYM5kQq4OjhxDC4MhOmy4nUdZY6ix7fCIIrdnlYCjIpasap4LVruDAYl6SKRRh3c6W4ItQpJzomhRU/EtiM1JJjJWtJqauxZSMxSSfXViiaJGlcJ5ZWR8ICSPrFCN+7CmgXNJIJzkzTylzGumj0RAtHhCLe2OmK6USQZBi+gRZDDibZs6JcKx8sOBbbCo+F+KdE8BQKVWWo+Zqc+VqteTMCi0Vz8zsSWA1irIlI0ob7faYN3rWPNjMmlx4aMMVI2tfkCZtQlNPetbhVmmfUfK9k/mkzHl7//6dejJvj7clpzvFSGKu83F2m1OumYJtqTM5rURGT7J8XD+YKxe1yyWrIVkN4AwwmdWaSc3WZ/W28zrThGzM25zpVspc750QYWuI4JP+EbGn1NOJcbhRRSuVLETdde0PHRozDg7ZSucmHsimNrn5pWbGNAD8hos78wdkYpO8lAFpZjaNByNmW5Fhcp8Chld+a9jaBTL9tiVBGk3mSJJSz2sVHeAQpz1yLdiU6mOKEF48LZ3rZaHNyK6taJbTks0LjzJ4xTpEDFc4byUTMQ27mjFkTwe+zO7P8UdUPcXZ/lsnNmrEX9wX+CdUv/YsUX8Mf+tL9z8X+FFT
Tag=sojourn,0.000,1.000,449535.000,HISTFAAAD/14nE1YQavkSHLWpLKy8mVn52iytWmNrNUKWauVZVEUhRBFUZTLRfFoHk0zNLNDM5g5NHNYFh+M2cNi9uCDwRh821/lf+Orb/4iUvXWr7vqVaVSGRFfRHzx6VX//ud3SfJLk8SfdP39FX/5838kf/8/ceG/tUjFf36Ft9+KjXgSb8Tfiq/xTfxavMWvtyKRCd7/RuDaO/GN+CsspmKLvQneBV70PaEPQr4VW+zGzgS3YQtf2dJVia3iW3p/K2gL/9/S1lSSNdwgvsXdgm/HnbTOW1LxG7qd7K2WErHhfRLbJFyUKX2K5nBbIuQv8IUuIhqJHSKez1uwlSLi729FNE1bJHnMp0u+JxExJFhVtJbSb8k3bLEqnzgM8oXPAyBSkYcqwsLnI4QnRpFW2Xv28a3ckplUERxJRCMaTrSiMFf/cAOHCGtvyNltBIU9pihxz4YubhmvCDa5vmG0lXiSFEvyMMEvyStrZrBHJVLR/pgPmSgcmawI0R4Okk7EuZJ9IiOcUzoJLxW3S/lq4/UD70GcCZmhbfSz4SOxW0n2R+FNI1D6wAf9JQ8RGM6t5P8MTkSWUyRxsOKgcAbBHt2Vkv3HVr06GdO/7n98/Iu/bE/hNk2b2U6iVGo28nEVNaKMIBc1hS8VOZ7IDVvHR8nQKEqH+Fom/FU+zlf8Sl/TLLnc4DItAPOYbmk2ai1NAjWClHIuZUy75DM04cWLlupNmHWzJoPKwB1rtYK3hDbCwBWlDaxIaRT9CASCzfikETRlEJ+9pJjYTazjUAXwABaWtcABFOK6RUsdXU8Mr3E8UmMLn8vRS3IT68QudI8iuISlD5TYVMu1hDU5TKcLaRUlDIdjFzYTaoQoA8CFS/GR/2SQXGNv2GnpKAucb7KJU1QsJ5xGpuka+aAow5SAaBK+oQUQIW1SnFuqKwO3pErZgZheLkvFEMvYLatfMKHAP5pMcfIjkTAidLIj2GPHCk19oKgkQQuSixWrT5LJgM+iuo41gSWtOd0pFwingtmHDTPEKh67urPRXFxqrfxHB1JXpATSlnllo1LBlLVSZSJjlcoNf1CMOoUaSxLFkMaSTVeq25BzT0KtdEneJDEbuJxGhqA75EpID3YTkei3tFU92o6JaOUf9oT5RlGwTysHw5giq9uVDDfUfYInBdEo0yoDH21Te8vtw3IcCXEaRSp5vDaRBlMebgwJBbPhmkgZnpTDTuS3Ivok4wQSqYrxRFfF2rBMRonaKIaE60A++nqdPGLdGXs3MoCUrweItWi4tlOm2JUgiViUfMWVeZOYAnVHs0nTkehkvGQaj9s+hk2ymk0j/m8lp1cwogkDmfL041Rv49D+FmvfAMOvH4dsVw3whH9bSIB3LAWw71ua1XGS0NsTbsJIpcz9deRQfH2Df9j9Fll+YqNv1rHyJH7JBn6Frd9g5Y18h9NZEnDW3jHY4hckQeIHytobfPkNa4WvsfdNFBeC7nwM+JSG5ZYNreBvo02xzr13sa1T+SRjJURZ8USjYbOCxxqE0P+7mDWe6mgKap/N2vd89qNI16bnlG6U4spJ18rjGU7voM0oSygX8rUvYnEwi7N6oSHDPJMy4aaxSJirKOvMn4ytiLIrtiawAP5PMq48SVJuv2IbUY6Ra1EZxITF8QomJpaNw5QpKrJMBExGPSfiJ7mN1lN2bvOYk+KhVnhCiDhX5YMoiYFjd4vIb4S4FslrvQv1FKdcQqE+iEHIV0EmLMWqEksuUGsSD2zYeGT72FvIzQrJo+hXsUDEjChTyTRKaxu5aksmr4iE3siI/mZ1gOowMvQ29ofcylVmxbaOHaxWiovdQ4cppuYH9RE9sVtvokojbN/yCtSpZLW1jUdS3lS6DnO6ka6RSkzW6mCrctVmseDg3RuQBcOaalZwMcNPaygR582rO9HX15myla+kEkNScYRt5NoxKT8UREWbxvp7iEgic
uYy5tyoCklVxDiiamThG2csQ8VKiLlO8wBCcdAAtTxA4JWhk7l9SAxAd4DV8AFqxEInCKMV6R8hDC5riTVFXUVbFQsUvWqGN8JSgHHkyYek5fHFNZrKR8kpRjZWa6SzWHUMCNcEdQWpB9gUbJ2InhWnYQ1D44rUBouPKCMYQ1JJ0F0k1yJkJD8Eawu6gl8Iy3AWlV6LNrWpWCHm+t8YFcWXosywxOEyi/WlogMrpZuoXlnfxIxq2qt1Kh+ak7Qb/9ZMgGtpQL7oVbJEfiHhLExKOeKJ+piiBJpMV4Uit6QbE9KQj3nGYHIGGVANJmSRQimMAUmelIpKKfpN4o1FaWI01UnKIo3jBzykl3CJ5AtXButolpXANCbOUA1pkrgysZLlL+5g6rRRo1FVRCnKACTCr2NWxUcXzaLJsN6l81EilA+VPJh6YzdKraIsiRyAFk7WBzbJ7rGGiDOaUxkTKlhHECSpiv26PgKwNiEgVsogmassdwnVE3pZEUiEGwtolr6kKB4jgGLUURPCWcUVGTE1wmmErg0/z+Capv4icpA0VRSLPEZGRgVNpztyBkojSmyCm04z66MV6UECkdIBm9Yx+a1Vqv+fEOWVlBk3MZnSDvJemQwNZCW8QlFJq3MJF53INLpZSYeHFThrHPB3LMSdis9OUVDL3BprdKYt8HHY6J0L2jkEkwuJZW2cM6AH42ywJsMHq7kipMEjiYEVHOtspp3PpM9M7kLQHma7zOa2qPPcO+N1sK6ySemcqmADBjLnFVzD6cHK2ipqGZ3jTGBvyFHqS0lVqzR+AIzJqI2UZaRh2dEjlrT82AMItYYdZ/I8c9ZpnRmnrXchs/gEKzrDfmcL63KfWY1QnMtdnjtjfZZrLOq8cpVBGMH5HHc4AFjgxEx745X3tgJNAufcIwKYJrYkJ/mREB1iiCQBs/bSGO3pWUhID44CshniB1ZKe22tS6xTGSIhBJ0WGY5Qzhjchngc0ZZFRmoHN/nhSThFGOss8x40aU1pQ5nDo1wHVwRTOiuLEulBOLktTePyModJvJm8RAjC0I9TSFOhEXbmi4xg8IAnGF+EvPKmdqrIvEWKkXWKS8FNawAaSgQygToY/Q0/MRoApAbuVhk8CFtnAQowtXQJSw6u4RyYz5xGFTkqRu48JhnqvvjgRn1mkpXc+I8GVK2OGgEpBg6JxjkgDMPkL6Pk0KsGEvHpOI2DxiJKoK2BUEa/gRohTvBzQyp+PEe8xAHY5Mgr1CLcRQsQrIYqRtssQ0CuytAIiAreOO3GvixrPzrvcLYPpcuK0tZl2dZFV1T90vRNca46CpjOATOkxPomIGkZejDXWShN1WQm5JojE7owZAmVGuCA8shu3aISUcgVEpvndVflVTMWU1UUWcEpK33hyhzXhyzP6jaEoUXGqjz44FAg3uoq9wV2FaEKTd60WT3UbdUXdQhVXZVV1ng0SFv5rLBF7quAlrVUNQqLddFUNvgObtZl3fmsKftQVr6oqy7UU26aMsuqbFeWOX6CMUWem1zaUKDZjCrCoe2nZqimadqHss/mMObVft8em/6wG4apu/Rd1be5lxnMG4liLrPSITpT2TzoUKOK62bnss6I3Pkuc6UDWfS+hgeAsEChgUNInxgavTTZachTzlHW6FWvuNTBJJRI7QCwcFlVSJQ5cSHuAYlZIiz0vfegBTfWLvg8R2fmRWMbFTjjDkUOXIRFxgqF/kRnoEh0ojXWkayist6X1nmB5s58TpcwYyR1ZR5ql9cZqikLaK68aqt2rNvbvG/GYSjaZb/bz0s79tPYvW/rZgplZ5FiH9qxHNs876beB98OZYEgDMVhuCrR1iBigI86Lk1XlHkohnaYrvW89N3tsHz88G9f/fMf//SvX/7rq89ffv/H3/3hTz/+4acf/+nTl3/57tNPn3//+X75/vPnn4+X5eX8/efb8by/7U/98TRPz6fpMB9uLx9/2H2cl6U/7Kd+OBzmqb8Mu6kITXkY2
rIpdsth2nfVML5//939H798uv/84fbl5w/n559uP/7wYR7HruvRLN52RYGceiG8J+QqFCgGFIofjUd/QQKQpDNZgYBJIOtAekiCRl0iUSoQ9WgaaqBFtIwDo6KzqU8pw0g/YMlz8G5GesIQ9aAgmMwBlaK/mKFLMBKIALAbc8ODrYgHpEbRB+VpiqBhLf25CnWUI++Z4z9mgcNAx9ZitIJnjCq74B1SIh2VTT7WGDHBVyi6rvG6ti3eC9BDiRR1rQWREJcWGCA1ajaAXsspYBaV6BrUWxinLKdhgwmAxgxZ0DmyG/rC2mroRAFiwnyvutqXIPkAXsmowzOP0q5DQTMOJGKxWAbcY7KyyYlITKF9XoNrgCNmtitrZdHy1tuyqgrrhULPh6wwisZAV+3gK4gD9FMf+tBM46ENVbWA7Yq2zCaXN2Nbtl3TgUuqfuhHeFY2lsowK7IeQYQiB2S1DUgYTd8cc9oSesHof9D/u7GcBAwTjAyMv5B465FFR1SZZzRIoQIwJ5Br4AFdgCIojK2gYBCgAuiwpkVVVU1TNzVRT15WRdejb5qicU1os7YIoOe2BRShcVmbFSC2oi77gviwaaoaTN2hSbGvqgBQ3fquru0weTCOGwZXV1lZhK4rSzBnW5TFYDqQbeiyvAkDRiRuBzKozQFjtQk5Dh2AcwPcQ1k2AWkqmwbMigOqDqf2oG0QQVEdxrLK67xoQ1NnFZLXBFsVE6qg3tVFCIFMNriS92NzOt6fL+3l9Dws728v1/l2WvrxuBwv79GLyzxej+9Pp3G+Xu7nfj4Ml3u/HN7fpyVMwKFpvK8LEBOiQMOWRXntT7v9ZffxfkMnn3fn+XI7LtN0PmN59zwtw3S6V+chZG3pS9yQV+VU1/vlcr6U8xHdfOyvp/fH+fr9fL897/vOFRgIQ6gaW/lQl0ORD7mn+VKiXvqmgs1dVeZDE0JbQ4EUvqwrpK0usrprIVvGvKlc0deA2lMy6qLygMFloSjKocXsycGFwLys2gEE24auziugFCbksKzKrgCz5l23n07n6fllvr/sz/N+oa/n4+mwO5/Pt+t5Ps3jYZkOh8u4u83L7n49fxiP8/JhPu+XeTkv3X2eL/sP0243347j4XRYLtNtXObnapo/Pj+38+mlBwLH/WG375f7YY9Tlufpu/Pz9bl7uezOw344Xe/jy3Le7Z6Xl5fb0N7uuw9Lt9yG47CMExj+dD/clut9fz00x/18+nQ5Xi/T6XT4dD3M4/na7ofj9dxdDsM8TfP+cNxNl6G+TLtl3r3Mx3E/44jjcbzAkVN7PC+Hw3C6wOL1vlwP574fR9x7Rn6QsfYwzff5dNmfD/Pudjmfepg/n6qlv+3h/ny93qdbP83XZSLvT4dxdzygrO7H5Xbazef7pYev87QggucfdndYP76/TefL9OF4vjyfjqi9HjADELh6u+zeL+fLiLMA3Xh4Xj7eLvfrMN/gA+rsPB/PLcyfpqxoIGXy/wOVg4FT

Currently, HdrHistogram_py fails to parse log lines that contain a tag, because the regular expression for interval lines does not match them:

re_histogram_interval = re.compile(r'([\d\.]*),([\d\.]*),([\d\.]*),(.*)')
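A minimal sketch of one possible fix, assuming tagged lines carry a leading `Tag=<name>,` field before the three numeric fields (this layout is an assumption based on the Java log format and is not confirmed by the issue). The names `re_tagged_interval` and `parse_interval_line` are hypothetical and not part of HdrHistogram_py:

```python
import re

# Pattern quoted in the issue: start time, interval length, max value, payload.
re_histogram_interval = re.compile(r'([\d\.]*),([\d\.]*),([\d\.]*),(.*)')

# Hypothetical variant: same fields, preceded by an optional "Tag=<name>," prefix.
re_tagged_interval = re.compile(
    r'(?:Tag=([^,]*),)?([\d\.]*),([\d\.]*),([\d\.]*),(.*)')


def parse_interval_line(line):
    """Return (tag, start, length, max_value, payload) as strings, or None.

    tag is None when the line carries no "Tag=" prefix.
    """
    match = re_tagged_interval.match(line)
    if match is None:
        return None
    return match.groups()
```

With this variant an untagged line still parses as before, with the tag reported as `None`, so existing logs would not need to change.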

Truncation warnings on install

https://developer.apple.com/library/mac/documentation/Darwin/Conceptual/64bitPorting/building/building.html

-Wshorten-64-to-32
... You should fix any warnings generated by this flag, as they are likely to be bugs.
$ sudo  pip install hdrhistogram

Collecting hdrhistogram
  Downloading hdrhistogram-0.3.1.tar.gz (47kB)
    100% |################################| 49kB 3.7MB/s 
    Installed /private/tmp/pip-build-2jisi2/hdrhistogram/pbr-1.8.1-py2.7.egg
    [pbr] Processing SOURCES.txt
    warning: LocalManifestMaker: standard file '-c' not found
Collecting pbr>=1.4 (from hdrhistogram)
  Downloading pbr-1.8.1-py2.py3-none-any.whl (89kB)
    100% |################################| 90kB 2.9MB/s 
Installing collected packages: pbr, hdrhistogram

  Running setup.py install for hdrhistogram
    [pbr] Generating AUTHORS
    [pbr] AUTHORS complete (0.0s)
    [pbr] Reusing existing SOURCES.txt
    building 'pyhdrh' extension
    cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/python-codec.c -o build/temp.macosx-10.9-intel-2.7/src/python-codec.o
    src/python-codec.c:166:20: warning: implicit conversion loses integer precision: 'uint64_t' (aka 'unsigned long long') to 'uint32_t' (aka 'unsigned int') [-Wshorten-64-to-32]
        array[index] = value;
                     ~ ^~~~~
    src/python-codec.c:187:9: warning: unused variable 'max_size' [-Wunused-variable]
        int max_size;
            ^
    src/python-codec.c:273:15: warning: unused variable 'res' [-Wunused-variable]
        PyObject *res;
                  ^
    src/python-codec.c:340:41: warning: implicit conversion loses integer precision: 'int64_t' (aka 'long long') to 'int' [-Wshorten-64-to-32]
                        max_nonzero_index = dst_index;
                                          ~ ^~~~~~~~~
    src/python-codec.c:342:45: warning: implicit conversion loses integer precision: 'int64_t' (aka 'long long') to 'int' [-Wshorten-64-to-32]
                            min_nonzero_index = dst_index;
                                              ~ ^~~~~~~~~
    5 warnings generated.
    src/python-codec.c:166:20: warning: implicit conversion loses integer precision: 'uint64_t' (aka 'unsigned long long') to 'uint32_t' (aka 'unsigned int') [-Wshorten-64-to-32]
        array[index] = value;
                     ~ ^~~~~
    src/python-codec.c:187:9: warning: unused variable 'max_size' [-Wunused-variable]
        int max_size;
            ^
    src/python-codec.c:273:15: warning: unused variable 'res' [-Wunused-variable]
        PyObject *res;
                  ^
    src/python-codec.c:340:41: warning: implicit conversion loses integer precision: 'int64_t' (aka 'long long') to 'int' [-Wshorten-64-to-32]
                        max_nonzero_index = dst_index;
                                          ~ ^~~~~~~~~
    src/python-codec.c:342:45: warning: implicit conversion loses integer precision: 'int64_t' (aka 'long long') to 'int' [-Wshorten-64-to-32]
                            min_nonzero_index = dst_index;
                                              ~ ^~~~~~~~~
    5 warnings generated.
    cc -bundle -undefined dynamic_lookup -arch x86_64 -arch i386 -Wl,-F. build/temp.macosx-10.9-intel-2.7/src/python-codec.o -o build/lib.macosx-10.9-intel-2.7/pyhdrh.so
Successfully installed hdrhistogram-0.3.1 pbr-1.8.1
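To illustrate why the -Wshorten-64-to-32 warnings above can hide real bugs, here is a small Python sketch that mimics storing a 64-bit value into a 32-bit slot, as in `array[index] = value` at python-codec.c:166. It uses `ctypes` only to reproduce C-style truncation. The helper names are hypothetical, and this is not the project's code or its eventual fix:

```python
import ctypes

UINT32_MAX = 0xFFFFFFFF


def store_as_uint32(value):
    """Mimic an unchecked uint64_t -> uint32_t assignment (high bits are dropped)."""
    return ctypes.c_uint32(value).value


def checked_uint32(value):
    """Refuse values that do not fit, instead of truncating silently."""
    if not 0 <= value <= UINT32_MAX:
        raise OverflowError("value %d does not fit in 32 bits" % value)
    return value
```

A count just above 2**32 silently wraps to a small number in the first helper, which is the kind of data loss the compiler flag is warning about.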

Provide method to generate output for HdrHistogram Plotter

The Java version of HdrHistogram has an outputPercentileDistribution method that generates the output required by the HdrHistogram Plotter here: http://hdrhistogram.github.io/HdrHistogram/plotFiles.html

I reviewed the HdrHistogram_py codebase and could not find any similar functionality.

If I missed it, could you provide documentation on how to call it?

If it is not yet present, please consider this a request to add that functionality.

Thank you!
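Until such a method exists, a rough stand-in can be sketched with the standard library alone. The sketch below works on a plain list of recorded values rather than on an HdrHistogram object, and the column layout (Value, Percentile, TotalCount, 1/(1-Percentile)) is an assumption about what the plotter expects. It is not the Java `outputPercentileDistribution` algorithm, and the function names are hypothetical:

```python
import math


def percentile_rows(values):
    """Return (value, percentile, rank) rows, halving the distance to 100% each step."""
    ordered = sorted(values)
    n = len(ordered)
    rows = []
    k = 0
    while n:
        p = 1.0 - 0.5 ** k                   # 0, 0.5, 0.75, 0.875, ...
        rank = max(1, math.ceil(p * n))      # 1-based rank of the sample at percentile p
        rows.append((ordered[rank - 1], p, rank))
        if rank == n:                        # reached the largest sample
            break
        k += 1
    return rows


def format_rows(rows):
    """Render the rows as fixed-width text with a header line."""
    lines = ['%12s %14s %10s %14s'
             % ('Value', 'Percentile', 'TotalCount', '1/(1-Percentile)')]
    for value, p, rank in rows:
        lines.append('%12.3f %14.12f %10d %14.2f'
                     % (value, p, rank, 1.0 / (1.0 - p)))
    return '\n'.join(lines)
```

Calling `print(format_rows(percentile_rows(samples)))` on a list of latencies gives one row per percentile step, which could then be compared against a file produced by the Java version.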

Encoding compatibility problem with HdrHistogram_c.

Here is a blob I generate with HdrHistogram_c's hdr_log_encode:

HISTFAAADCR4nE1Zu44dxxFlz/TVzmq14CxJgRRogk2agmUIgmw4EBwYahkMFNKxE31KzXBkNAUGo0fAcLjYQIEDfkKLgIENGTlWZKf+BFedquq5C+x99HS9Tj2772+WH2/fuPHgPzf0r7f3IC///vHqxpf/04X/npD8pXO8hUDDl0TDn3TxlF8mfr8pj84pHvjzfaLulCqvDAfsesxPo3IZgjHhv3xOKQh1lm2jLHU9HvP/XKkDRaZVREQlAZfxFLKVZpD9GU+JLg68+4jTIC9R37Cmy6MQC4sANVWht6lXbXt6zY/4y1J452oKjSZCRM2mWzK1VKTuA1f+e6ZL0UxWm03PKCzGULdmbVJ2WGDer5Pv4+eiFnSrzBmCFwa28OOe2QhCvpXqoAygQqQsVIJ84X2iaqbFCUZ3A78L/owbvxXqFGg3VfUrKxboTtNrKS6zdA7Bssja7DjA+1My5AfHQ0w3PKqSPlVW9VazgwDypewpi1gTJQayuS2I8WSaIhBo+ka+qLsrGzk0t8j/VgEyQy5mjD2tm3BZKsCGTH6/oDczeGy6fM2m9cAi7op1TBeZ5zab0wXjFYCvNCwrAYOfRCg/v+rI0eBPVVwohiImAMLJoh+SCC7Kcoti3hp4+/fwe1F3mC20BLxpXMFHk8isM7Yxw7mjtbp7EXb09ZIUn0nYnKxU2NNRWCfXDxYwTegbrCpWAke1Fz9WJHGdVwV+BdTZEq5TmBakxR1hspqOnbuVNU+Wou481nn0xJiBNPP7Ho9tfWimTEo77oCIQkUYjce5qMk8eQS4/Ko+o+tpJdiiVEH8J584zrcy2W5QauR/UmnKVEelJiT5+yL7sOBzB9VgTDU9us04FF3XcI0WpXQqWLcQDubJZZBweqW+mzMUL4OiCqWUuGStkSmMPVwugA8a5J0CFc1wXh1h3hqxk58Prg0oUCsg701C7A+SubvLAsiyBmKkZ5KTRwVvTv1MXisATTipThggzxLFPakZIzu6HhIttbLZNyOYxDBnUWRB6TVZRZkNqifXwwqss7oUI1P10NF1c+seixQaiQIrMatNhlc2M0M4q5yh971aKZh+84KP6OVAlCrWusUoMTIzJPywM/vDsUAzPlOrAxrf4ZsTsEf+SKAuwJ0mrupoE8Jl3aIoJBk7uUVUgqbuyv+nWKqR3Ne9N63otR8WfYCK1RwWw+rAjdS6KOKgUzhMbStj2uYRIq3Y0Wqu/s7W3EfBpNenAe1r6EeupoNBPGjgZTFdYyrKtq+KwpkPXGGH5sSE2KBWeJJSDwrpJsiJybWoP9QkpBM+degHQmn2Lq3OQ8XR8iQ7FFE88ZylvNPGtqm3eDyhactwwobKEx2GxeLr0ThIG6XrZPxWK50idQjSN6xroZQo0eDhKVGT0qwYADUU5WD70YIfK8gZzbJR8v5ibUfZB8eJjpstIzmJAkqzylxmEbJl66wFb99VIUtWquGzTqOUP97jt/K+7LxjgHE2bK1TS7AOOnFB8iRdhWvqRG1LBa/FUpQJZGPxbBpEiOgZ+mILLExrZWe+9dIEDjqbzDbGxQPKzlPvE1VI5dGThgXjNtC8dIpCj8Ws+FzxrnUhmT3DnsRqewDcKvkOed7qAipysYzrmHltg9dgJN2huaSzWmPoT2oWzO6obj7sSTEQ5fjrD/SzMLzklcQmFY1yoV8n1KFJUFl3jGXOy6cqAVO24P/YQnw17CImDY66Ho6P5rSrQQcbmaS2sWdTQI1qnZWh7h2ovQ5mYWvz6MqK4OJAHk0D04iaNHdtXVvIpJNq1dEE2drlfcw3UJvIYiK+6lf53PXUnjP9L+5x4F5U3bxvCS0esSyKvaCkhe5egBbevFFDn8kAAvBOd1bWb1Bg3pNPq070kjpZlDKnz4G+Ds1Be7STDcI2ON0ypNAJRJdEnyCdAhr7kkKyPkUxiwUlwzVV4SreTtWcWZ2icTA4OsUOUKOu/6oNixXk8gaNJu0fwRsP2fmNH2yz
Wl5MyarDmvoIxZR3zYZykta5aAPPrWQ7XlwpOw+n3mGymDIroOd4FDdR5nBwSdoyza2lHMXGJPSFPPwyQ0bLe7Jeu3aSnIPPUANZFkulv3SXgt3I5tnsYofTsLm3uQpgC9JplCQSLpbMzUf6DUa/axg4FIdm7PZE16fgu4vEyLS4OvsMMbWlknTaixDWyeC7aWDJ0BgGr4E4Xm+RcKQp1TT8RnuXJPuUwWEdTVAF7tKrkumz4eFwNF6tR/NhovIvjeJp0TOGjU+okXWfDZRG8PdmkxsTTEEL0iZIxaMXwbvs3/DURH9QXQUHwyZES6mIiMrmgzkVmqcoDXJVx/+B5qg9ZKWqMeSsOTS4dmsFEnAZgmS+YsXXmRB18oXxoLjhiLZsQNxc0oGhDio2u+DMj3Baxt3hr/1+ISia/Pd+0MaHagAesMBmGvk+z0MYmJsMvZXSO576vp1fCN8qR9jaUh/jw4HqItcf03UQHX8+OlBtXtk0SNCu2sBQWpZKZ3mLYrZVaLNEDmMfvxa6Y8VBmGJIS2Z4+SvL1l1vsUbv8RdcTICkZs3QW0I7S0Pz1MiYlZhMYwXE9amAWrLPbkO/Te+immKMcsjN/e5RK0ok/X5DnMx1n4+sgSTab1cK8EjwBk7g05tkHKB6kNdpBI9JmUiYbEVP9gbmU4F39rOZBkTPnp3mNvs9gcyKCxq9XKPnAMyGCoVKRi2cxV5KVr+wSxV65alU5pdrkri+XnC3Mi8InddSJZHzvBSKxq6E1VueP/ls8ZJr2taJkrDhUvYv9HZbebr8VcurJPu2PBGlN8uOWqdQoZweBTj0NmD4D876uG4SF4IjjOGeUqrPlqwJNwSh3aD5d2L3/OqeaDpp7/NRetn0ML2fn8qGoj/vh455ybjbQV2ZvqW1kybEo+pKP+SqIdM5udS3SY/DdWbYhW5loT/RWmx8vBU0Bq8uH0jhq5Ku9XqawlpSi6UqXv1FesbzUkRH8eWCSz0tdkezEK2cTLOco1/KHY2oXMzNAHep9zTm1lqF4gdtoWv5Z0Q/5dlVjwE8ZVYfk4Mf7NHQpOLSlRbNloK8QcJBdHjpqliNkW/X03x0YiXMvxvsmB8FhJHeZxZ4AxMTBTsSGp3hwVw3i16VE3XI6GkfcPSyUG8Ay4I7WKSEPwzo/9MAJTT3Jkj9iYNz8PkQN4kHvwHgWabSLuHZCS3z3le01dpA2bu2nR5f2nhik1ULkTabRrmnmNVB8iwckpucg19LxuDT/IdsQuGy+UC+WWKiwOicrALaNVvrfTaD5azTjd2FB7OOP941RcXQfNaYoVJ4OTuY8uMJwPMbZ+ENaLpTpcVRRNRT+XKRP9w0czvXZRWj4OU26vL/IwyMGHPjhwYQvDaeNSTlTX8w0EIL2abvFuyCxS5a4olOlCcyQ54YlIvfBxzyp4I6xHU3RQO/boFp4+/1FwfqHppItU7v/zJcoCFI44VC8ZASfr3oPqPuzADqjGHJ520w6xQmmDjgNuYv/OG+/ZYRP9b4HhWC7kEDaPzUrmOC2SKPP96VYkF/NonMKd3WpTMH/++kUtJH8u1ihPLi3+GBscvyROd/m96BjrSnvl3VUgT+44kloGJgQXUgtxBbZ4kvcXlCyY16qPJfTc7Jr0t4w4UytZD3qDsz+VBwJbuxxzFTzhln3klnka9BM1ILTtXKbi2YVycVQCMBMM37dZXXCFnmcLPcZePkU9id7MXtzG0fJcZHne+D7IF1rFZ62En4hI+OLv78OD4+RAVvBWNoLEeOxHzzaBTy1/Sh8moG8EK8OAIdlmSk2sAJHW87xJR+q/CK8x2c/NCQDV+4WYjAu+CPQNhvc0Bw5ipi2UtEwDFe3LzaSLPumH7u9lG6S56/Qjne9E/m4xFXeoufUYRBtZ+kWIIApeHKgViMWpT+HKYKi3jA7LFprYHig6GOqtBZSVC1k3KQHeKTod0z5s893Cho7N+3oNdslOQ0VDxihH2r67wrCnYK4DII
1QjR4b4hoXGkbewzPPmCMkMfznXDiFo0foxEtEAOf5TXOd/mvBUtbqasAvk0p1f4v8M3vSKIt00dqQJnPh1RvstzER6poGgHpmgVAtioet1dQr06b9eme2LjDYo+Ip0PgqDJQaY/cB1ogPERyuT/AxTDpLI=

HdrHistogram_c decodes this blob correctly:

  std::string payload = "..."; // the blob above

  hdr_histogram *hist = nullptr;
  hdr_log_decode(&hist, (char *)payload.c_str(), payload.size());

  std::cout << hdr_min(hist) << std::endl; // 500
  std::cout << hdr_max(hist) << std::endl; // 11023
  std::cout << hdr_value_at_percentile(hist, 50) << std::endl; // 2119
  std::cout << hdr_value_at_percentile(hist, 90) << std::endl; // 6947

But when I try to decode in HdrHistogram_py, the result is wrong:

In [2]: h = HdrHistogram.decode('...') # the blob above

In [3]: h.get_value_at_percentile(90)
Out[3]: 500

In [4]: h.get_min_value()
Out[4]: 500

In [5]: h.get_max_value()
Out[5]: 11023

In [6]: h.get_value_at_percentile(50)
Out[6]: 500

In [7]: h.get_value_at_percentile(90)
Out[7]: 500

Ability to create html files via the Python Histogram API

Hello,
In the past I used https://github.com/HdrHistogram/HdrHistogram/blob/master/GoogleChartsExample/index.html to create HTML charts from histograms.

The Google Charts example requires the input file to be in a particular format: https://github.com/HdrHistogram/HdrHistogram/blob/master/GoogleChartsExample/example1.txt

The Java API generates this format with its write methods; however, I could not find an equivalent in the Python API. Could you please consider adding one?

I am happy to code, test, give feedback, or contribute if necessary.

Python interpreter crashes on decoding empty histogram

This can be reproduced with the following (Python 3.5.2 on win32):

>>> from hdrh.histogram import HdrHistogram
>>> hist = HdrHistogram(1, int(3e6), 1)
>>> data = hist.encode()
>>> data
b'HISTFAAAABZ4nJNpmSzMgACMKLTusQMwPgBBywL9'
>>> HdrHistogram.decode(data)

similarly:

>>> from hdrh.histogram import HdrHistogram
>>> hist = HdrHistogram(1, int(3e6), 1)
>>> data = hist.encode().decode()
>>> data
'HISTFAAAABZ4nJNpmSzMgACMKLTusQMwPgBBywL9'
>>> HdrHistogram.decode(data)
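To check whether the blob itself is well-formed, independently of the crashing decoder, the outer framing can be inspected with the standard library alone. This sketch assumes the compressed layout is a 4-byte big-endian cookie, a 4-byte length, then a zlib stream whose first 40 bytes are the inner header. The field names are my own labels for that assumed layout, and the function names are hypothetical:

```python
import base64
import struct
import zlib

# Blob printed by hist.encode() in the report above.
BLOB = 'HISTFAAAABZ4nJNpmSzMgACMKLTusQMwPgBBywL9'


def split_frame(blob):
    """Return (cookie, declared_length, compressed_bytes) from a base64 blob."""
    raw = base64.b64decode(blob)
    cookie, length = struct.unpack_from('>II', raw, 0)
    return cookie, length, raw[8:8 + length]


def read_inner_header(compressed):
    """Decompress the payload and unpack the assumed 40-byte inner header."""
    payload = zlib.decompress(compressed)
    names = ('cookie', 'payload_len', 'normalizing_index_offset',
             'significant_figures', 'lowest_trackable_value',
             'highest_trackable_value', 'conversion_ratio_bits')
    return dict(zip(names, struct.unpack_from('>IIIIQQQ', payload, 0)))
```

If both steps succeed on the empty-histogram blob, the encoder output is at least structurally sound, which would point to the decoding of the counts array rather than the framing.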
