confluo's Introduction

Confluo

Confluo is a system for real-time monitoring and analysis of data that supports:

  • high-throughput concurrent writes of millions of data points from multiple data streams;
  • online queries at millisecond timescale; and
  • ad-hoc queries using minimal CPU resources.

Please find detailed documentation here.

Installation

Required dependencies:

  • macOS or a Unix-based OS; Windows is not yet supported.
  • C++ compiler that supports C++11 standard (e.g., GCC 5.3 or later)
  • CMake 3.2 or later
  • Boost 1.58 or later

For the Python client, you will additionally require:

  • Python 2.7 or later
  • Python Packages: setuptools, six 1.7.2 or later

For the Java client, you will additionally require:

  • Java JDK 1.7 or later
  • ant 1.6.2 or later

Source Build

To download and install Confluo, use the following commands:

git clone https://github.com/ucbrise/confluo.git
cd confluo
mkdir build
cd build
cmake ..
make -j && make test && make install

Using Confluo

While Confluo supports multiple execution modes, the simplest way to get started is to start Confluo as a server daemon and query it using one of its client APIs.

To start the server daemon, run:

confluod --address=127.0.0.1 --port=9090

Here's some sample usage of the Python API:

import sys
from confluo.rpc.client import RpcClient
from confluo.rpc.storage import StorageMode

# Connect to the server
client = RpcClient("127.0.0.1", 9090)

# Create an Atomic MultiLog with the given schema for a performance log
schema = """{
  timestamp: ULONG,
  op_latency_ms: DOUBLE,
  cpu_util: DOUBLE,
  mem_avail: DOUBLE,
  log_msg: STRING(100)
}"""
storage_mode = StorageMode.IN_MEMORY
client.create_atomic_multilog("perf_log", schema, storage_mode)

# Add an index
client.add_index("op_latency_ms")

# Add a filter
client.add_filter("low_resources", "cpu_util>0.8 || mem_avail<0.1")

# Add an aggregate
client.add_aggregate("max_latency_ms", "low_resources", "MAX(op_latency_ms)")

# Install a trigger
client.install_trigger("high_latency_trigger", "max_latency_ms > 1000")

# Load some data
off1 = client.append([100.0, 0.5, 0.9,  "INFO: Launched 1 tasks"])
off2 = client.append([500.0, 0.9, 0.05, "WARN: Server {2} down"])
off3 = client.append([1001.0, 0.9, 0.03, "WARN: Server {2, 4, 5} down"])

# Read the written data
record1 = client.read(off1)
record2 = client.read(off2)
record3 = client.read(off3)

# Query using indexes
record_stream = client.execute_filter("cpu_util>0.5 || mem_avail<0.5")
for r in record_stream:
  print(r)

# Query using filters
record_stream = client.query_filter("low_resources", 0, sys.maxsize)
for r in record_stream:
  print(r)

# Query an aggregate
print(client.get_aggregate("max_latency_ms", 0, sys.maxsize))

# Query alerts generated by a trigger
alert_stream = client.get_alerts(0, sys.maxsize, "high_latency_trigger")
for a in alert_stream:
  print(a)
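
When you are done, close the connection. A minimal cleanup step (this assumes RpcClient exposes a disconnect() method; adjust to the client API you are using):

# Disconnect from the server
client.disconnect()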

Contributing

Please create a GitHub issue to file a bug or request a feature. We welcome pull requests, but ask that you review the pull-request process before submitting one.

Please subscribe to the mailing list ([email protected]) for project announcements, and discussion regarding use-cases and development.

confluo's People

Contributors

anuragkh, attaluris, concurrencypractitioner, dependabot[bot], haal, khandelwalwires, louishust, neilgiri, neilgoldman, shaneknapp, ujvl

confluo's Issues

More thorough testing of compression

  • Need to write more tests to cover edge cases of delta and lz4 encoding/decoding
    • Different array sizes
    • Different data sets (current tests use monotonically increasing elements with constant increments)

More intelligent mmapping by storage_allocator

On Linux machines, the vm.max_map_count limit is hit very quickly (the default on an EC2 c4.8xlarge instance is 65536). It must be raised manually for archival to succeed, since reflogs have relatively small buckets and each bucket is memory-mapped.

Instead, the storage_allocator could memory-map larger-than-requested chunks of files and retain the excess of the mapped range for future calls. It would also have to map at a different granularity depending on what is being mapped (data log vs. reflog buckets); this could be determined from the size passed to the mmap call.
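
For illustration, here is a minimal sketch of the chunking idea in Python (not Confluo code; the chunk size is hypothetical, and anonymous memory is used for simplicity where the storage_allocator would map files):

import mmap

CHUNK_SIZE = 64 * 1024 * 1024  # hypothetical granularity: map 64 MiB at a time

class ChunkedMmapAllocator:
    """Serve small allocations out of one large mapping, so the number of
    mmap calls (and hence vm.max_map_count entries) stays low."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunk = None   # current mmap-backed chunk
        self.offset = 0     # next free byte within the chunk

    def alloc(self, size):
        # Map a new chunk only when the current one cannot satisfy the
        # request; one mmap call then covers many future allocations.
        if self.chunk is None or self.offset + size > self.chunk_size:
            self.chunk = mmap.mmap(-1, self.chunk_size)
            self.offset = 0
        view = memoryview(self.chunk)[self.offset:self.offset + size]
        self.offset += size
        return view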

Separate debugging build

Create a separate build process for debugging that enables the debug flag and does not enable compiler optimizations.
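
A conventional way to get this with CMake, assuming the build scripts honor the standard build-type variable, is to configure with:

cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j

With GCC and Clang, the Debug build type compiles with -g and without optimization flags.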

Filter using a function

Allow filters to use a function instead of just an expression. The filter class already accepts a function in the constructor but there's currently no way for an RPC client to specify a function, so the default_filter is always used.

Add SQL interface

Add (potentially modified) SQL interface for supporting queries, adding/removing filters/indexes/triggers, etc.

More tests for C++ writer client

The rpc_dialog_writer's remove_trigger, remove_filter, and remove_index methods have not been tested. Add test cases to the C++ writer client test to cover this functionality.

Detect missing Python client build dependencies

If possible, check in CMakeLists.txt whether all of the Python client's dependencies are installed.

Currently, if a Python dependency is missing, it is not detected and the build fails:

Traceback (most recent call last):
File "setup.py", line 3, in <module>
from setuptools import setup
ImportError: No module named setuptools
pyclient/CMakeFiles/pyclient_build.dir/build.make:57: recipe for target 'pyclient/CMakeFiles/pyclient_build' failed
make[2]: *** [pyclient/CMakeFiles/pyclient_build] Error 1
CMakeFiles/Makefile2:1281: recipe for target 'pyclient/CMakeFiles/pyclient_build.dir/all' failed
make[1]: *** [pyclient/CMakeFiles/pyclient_build.dir/all] Error 2
Makefile:160: recipe for target 'all' failed
make: *** [all] Error 2

Add trigger time-period as a configurable parameter

Right now, the time-period for trigger evaluation defaults to whatever the filter granularity is. We should dissociate the two and make sure the monitor thread correctly aggregates required filters at the proper time-period.

Range query bug

The ad-hoc filter creates an empty plan for queries on long fields with no upper bound (e.g., value >= 100). This may extend to other types as well, but that has not been confirmed.

Memory pool thread id

A reader thread freeing memory does not know the id of the writer thread that allocated the pointer in the first place.

Memory size parser

Add a parser for memory sizes, e.g., values such as 10k, 100M, or 5g would be parsed into the appropriate number of bytes, for use in configuration parameters like:

size_t configuration_params::MAX_MEMORY = dialog_conf.get<size_t>(
    "max_memory", constants::DEFAULT_MAX_MEMORY);

Improve documentation

Improve and add documentation where none exists in several header files, including but not limited to:

  • storage.h
  • monolog_linear.h
  • monolog_exp2.h
  • dialog_table.h
  • expression.h
  • tiered_index.h

Where to set a maven repo proxy.

Hi, I'm trying to make install Confluo. However, the installation hits errors fetching from the Maven repository. I have a Maven proxy (mirror); where can I set it in the Confluo source?

Consistent parameters across Confluo servers and clients

Some parameters shared between the server and the client (e.g., TIME_RESOLUTION_NS) can become inconsistent if either the server or the client changes them. We need to add a mechanism to ensure consistency for these parameters.

I have a problem installing Confluo

Determining if the pthread_create exist failed with the following output:
Change Dir: /root/confluo_test/confluo/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_6011a/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_6011a.dir/build.make CMakeFiles/cmTC_6011a.dir/build
gmake[1]: Entering directory `/root/confluo_test/confluo/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_6011a.dir/CheckSymbolExists.c.o
/usr/bin/cc -o CMakeFiles/cmTC_6011a.dir/CheckSymbolExists.c.o -c /root/confluo_test/confluo/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c
Linking C executable cmTC_6011a
/usr/local/bin/cmake -E cmake_link_script CMakeFiles/cmTC_6011a.dir/link.txt --verbose=1
/usr/bin/cc -rdynamic CMakeFiles/cmTC_6011a.dir/CheckSymbolExists.c.o -o cmTC_6011a
CMakeFiles/cmTC_6011a.dir/CheckSymbolExists.c.o: In function `main':
CheckSymbolExists.c:(.text+0x16): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
gmake[1]: *** [cmTC_6011a] Error 1
gmake[1]: Leaving directory `/root/confluo_test/confluo/build/CMakeFiles/CMakeTmp'
gmake: *** [cmTC_6011a/fast] Error 2

File /root/confluo_test/confluo/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
/* */
#include <pthread.h>

int main(int argc, char** argv)
{
(void)argv;
#ifndef pthread_create
return ((int*)(&pthread_create))[argc];
#else
(void)argc;
return 0;
#endif
}

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /root/confluo_test/confluo/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_f4480/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_f4480.dir/build.make CMakeFiles/cmTC_f4480.dir/build
gmake[1]: Entering directory `/root/confluo_test/confluo/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_f4480.dir/CheckFunctionExists.c.o
/usr/bin/cc -DCHECK_FUNCTION_EXISTS=pthread_create -o CMakeFiles/cmTC_f4480.dir/CheckFunctionExists.c.o -c /usr/local/share/cmake-3.13/Modules/CheckFunctionExists.c
Linking C executable cmTC_f4480
/usr/local/bin/cmake -E cmake_link_script CMakeFiles/cmTC_f4480.dir/link.txt --verbose=1
/usr/bin/cc -DCHECK_FUNCTION_EXISTS=pthread_create -rdynamic CMakeFiles/cmTC_f4480.dir/CheckFunctionExists.c.o -o cmTC_f4480 -lpthreads
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
gmake[1]: *** [cmTC_f4480] Error 1
gmake[1]: Leaving directory `/root/confluo_test/confluo/build/CMakeFiles/CMakeTmp'
gmake: *** [cmTC_f4480/fast] Error 2

An error compiling Confluo

Compiling Confluo produces an error. I have already read the other compile-error issues, but none matches my problem. Could you help me? Thank you very much.

error message:
In file included from /root/work/confluo/libconfluo/../libutils/utils/atomic.h:7:0,
from /root/work/confluo/libconfluo/confluo/archival/archival_utils.h:4,
from /root/work/confluo/libconfluo/src/archival/archival_utils.cc:1:
/usr/include/c++/4.8.2/atomic: In instantiation of ‘struct std::atomic<confluo::storage::encoded_ptr >’:
/root/work/confluo/libconfluo/confluo/storage/swappable_encoded_ptr.h:367:32: required from ‘class confluo::storage::swappable_encoded_ptr’
/root/work/confluo/libconfluo/src/archival/archival_utils.cc:17:42: required from here
/usr/include/c++/4.8.2/atomic:167:7: error: function ‘std::atomic<_Tp>::atomic() [with _Tp = confluo::storage::encoded_ptr]’ defaulted on its first declaration with an exception-specification that differs from the implicit declaration ‘std::atomic<confluo::storage::encoded_ptr >::atomic()’
atomic() noexcept = default;
^
make[2]: *** [libconfluo/CMakeFiles/confluo.dir/src/archival/archival_utils.cc.o] Error 1

My environment configuration:
boost 1.69
gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)

I only build the RPC component; all others are OFF.

Expose Sketch API for multilog

Expose Sketch API to be able to:

  • Add/remove a sketch on a field
  • Execute ad-hoc queries on frequencies
  • Evaluate approximate triggers in real-time

Expose corresponding remote API.

Monolog compatible with mempool design

The current reflog is a monolog_exp2, which has exponentially growing buckets. This would require a memory pool for each bucket. Instead, the reflog should have exponentially growing bucket containers that have pointers to fixed size buckets. The buckets can then be allocated using a single memory pool.
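
For illustration, the index arithmetic such a design needs is compact. This Python sketch (not Confluo code; the fixed bucket size and the container-growth rule mirroring monolog_exp2 are assumptions) maps a flat element index to its location:

BUCKET_SIZE = 1024  # hypothetical fixed bucket size, in elements

def locate(idx, bucket_size=BUCKET_SIZE):
    # Container k holds 2**k pointers to fixed-size buckets, so buckets
    # 0 | 1 2 | 3 4 5 6 | ... live in containers 0, 1, 2, ...
    bucket = idx // bucket_size                  # global bucket number
    offset = idx % bucket_size                   # offset within that bucket
    container = (bucket + 1).bit_length() - 1
    bucket_in_container = bucket + 1 - (1 << container)
    return container, bucket_in_container, offset

assert locate(0) == (0, 0, 0)
assert locate(BUCKET_SIZE) == (1, 0, 0)          # first element of bucket 1
assert locate(3 * BUCKET_SIZE + 5) == (2, 0, 5)  # bucket 3 opens container 2

Because every bucket has the same size, all buckets can be allocated from a single memory pool.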

JSON interface for RPC Client

Currently the client must read from and write to a table using a contiguous string buffer. Add an interface to the C++ and Python RPC clients to read and write using JSON for easier usability. Validate all fields when a record is written using this interface.
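
As a rough illustration, a thin wrapper over the Python client could provide this (hypothetical helper, not part of the current API; it assumes append/read exchange records as flat value lists in schema order, as in the usage example above):

import json

class JsonClientWrapper(object):
    def __init__(self, client, field_names):
        self.client = client
        self.field_names = field_names  # schema order, excluding timestamp

    def append_json(self, doc):
        # Validate that every schema field is present before writing.
        record = json.loads(doc)
        missing = [f for f in self.field_names if f not in record]
        if missing:
            raise ValueError("missing fields: %s" % ", ".join(missing))
        return self.client.append([record[f] for f in self.field_names])

    def read_json(self, offset):
        # Re-associate the flat value list with field names.
        values = self.client.read(offset)
        return json.dumps(dict(zip(self.field_names, values)))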

Clustering/scale-out roadmap?

Hi there,

I see there's been some activity in a branch called chain-replication but it'd be nice to know what you folks have in mind for scale-out architecture. :)

Possible memory leak

Appending to an atomic multilog even when it has no filters and indexes may have a large memory overhead despite only having to update a single data structure (the data log). Need to profile to discover possible memory leak(s).

Remove double copy in server side record batches

Currently, data is copied twice on the server side:

  • once while reading it from Thrift's serialized format
  • once while converting it to Dialog's expected format

This can be collapsed into a single copy with ease, and should lead to performance improvements.

Cleanup Query Planner

The current query planner has a few issues that need to be resolved:

  • Does not work if there are no indexes for a minterm (yields zero results)
  • Reduce filter is hard-coded
  • Does not handle all operators (e.g., !=)

Aggregates as double precision values

Store and compute all aggregates as double-precision values rather than the specific type of the aggregated attribute. This avoids overflow and related issues for smaller types (e.g., a SUM over many INT records can exceed the INT range).

storage_mode usage

storage_mode is used by both the read tail and data log, so the current in_memory mode allocates using a mempool only if the size requested matches; otherwise it calls malloc. Need a cleaner way to make the allocations.

Universal sketches, approximate queries

  • Implement count-min-sketch (supposedly more space-efficient than count-sketch); see the sketch after this list

  • Time-adaptive sketch: decay accuracy & storage for older values (using inflation)

  • Integrate with atomic multilog for offline/online approximate queries.

    • Figure out what the basic interface should look like
    • Consider:
      • cross-stream correlation queries
      • use-cases where the same stream is partitioned across multiple servers
      • storage-accuracy tradeoff, blowfish
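
For reference, the count-min update/estimate logic mentioned above is small. This is a standard-algorithm sketch in Python, not Confluo's implementation; the hashing scheme and dimensions are illustrative:

import random

class CountMinSketch(object):
    def __init__(self, width=2048, depth=4, seed=42):
        rnd = random.Random(seed)
        self.width = width
        self.counts = [[0] * width for _ in range(depth)]
        # One salted hash function per row.
        self.salts = [rnd.getrandbits(64) for _ in range(depth)]

    def _cols(self, item):
        for salt in self.salts:
            yield hash((salt, item)) % self.width

    def update(self, item, count=1):
        for row, col in zip(self.counts, self._cols(item)):
            row[col] += count

    def estimate(self, item):
        # Estimates only ever overcount (hash collisions); taking the
        # minimum across rows bounds that error.
        return min(row[col] for row, col in zip(self.counts, self._cols(item)))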

Trigger evaluation should be agnostic to application timestamps

Currently, the trigger evaluation framework uses real time (more precisely, server time) to determine which time-buckets to check for trigger evaluation. This can be an issue if the application supplies the timestamps and those timestamps have no correlation with server time.

Add tests for contiguous read/write monolog methods

The following methods of the monolog implementations need to be tested before they see future use:

void set(size_t idx, const T* data, size_t len);
void set_unsafe(size_t idx, const T* data, size_t len);
void get(T* data, size_t idx, size_t len);
size_t storage_size();

Support projections

Support projections for cases where not all columns are required in the output of a query.

Support for flexible schema

Currently, Confluo requires a fixed schema for each atomic multilog, which restricts different applications like network monitoring and diagnosis. We want to support flexible schemas (e.g., JSON) to facilitate these applications.

I have a problem compiling Confluo

[ 32%] Building CXX object libconfluo/CMakeFiles/confluo.dir/src/confluo_store.cc.o
In file included from /home/root/C++/confluo/libconfluo/confluo/aggregate/aggregate_info.h:7:0,
from /home/root/C++/confluo/libconfluo/confluo/aggregated_reflog.h:5,
from /home/root/C++/confluo/libconfluo/confluo/filter.h:4,
from /home/root/C++/confluo/libconfluo/confluo/archival/load_utils.h:7,
from /home/root/C++/confluo/libconfluo/confluo/atomic_multilog.h:11,
from /home/root/C++/confluo/libconfluo/confluo/confluo_store.h:8,
from /home/root/C++/confluo/libconfluo/src/confluo_store.cc:1:
/home/root/C++/confluo/libconfluo/confluo/parser/aggregate_parser.h:32:27: error: expected identifier before ‘(’ token
(std::string, agg)
^
/home/root/C++/confluo/libconfluo/confluo/parser/aggregate_parser.h:32:41: error: ‘agg’ has not been declared
(std::string, agg)
^
/home/root/C++/confluo/libconfluo/confluo/parser/aggregate_parser.h:33:23: error: ‘field_name’ has not been declared
(std::string, field_name))
^
/home/root/C++/confluo/libconfluo/confluo/parser/aggregate_parser.h:33:33: error: ‘parameter’ declared as function returning a function
(std::string, field_name))
^
/home/root/C++/confluo/libconfluo/confluo/parser/aggregate_parser.h:35:1: error: expected constructor, destructor, or type conversion before ‘namespace’
namespace confluo {
^
make[2]: *** [libconfluo/CMakeFiles/confluo.dir/src/confluo_store.cc.o] Error 1
make[1]: *** [libconfluo/CMakeFiles/confluo.dir/all] Error 2
make: *** [all] Error 2

GCC VERSION:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)

BOOST VERSION:
1.53.0

LOCATE :
boost: /usr/include/boost
