Giter VIP home page Giter VIP logo

phxqueue's Introduction

PhxQueue

简体中文README

PhxQueue is a high-availability, high-throughput and highly reliable distributed queue based on the Paxos protocol. It guarantees At-Least-Once Delivery. It is widely used in WeChat for WeChat Pay, WeChat Media Platform, and many other important businesses.

Authors: Junjie Liang, Tao He, Haochuan Cui, Qing Huang and Jiatao Xu

Contact us: [email protected]

Build Status

Features

  • Guaranteed delivery with strict real-time reconciliation

  • Server-side batch enqueue

  • Strictly ordered dequeue

  • Multiple consumer groups

  • Dequeue speed limits

  • Dequeue replays

  • Consumer load balancing

  • All modules are scalable

  • Multi-region deployment for Store or Lock nodes

Building automatically

git clone https://github.com/Tencent/phxqueue
cd phxqueue/
bash build.sh

Now that all modules are built, you can continue to quickstart.

Building manually

Download PhxQueue source

Download the phxqueue.tar.gz and un-tar it to $PHXQUEUE_DIR.

Install dependencies

  • Prepare the $DEP_PREFIX diectory for dependency installation:

    export $DEP_PREFIX='/your/directory/for/dependency'
  • Protocol Buffers and glog

    Build Protocol Buffers and glog with ./configure CXXFLAGS=-fPIC --prefix=$DEP_PREFIX. Then create symlinks:

    rm -r $PHXQUEUE_DIR/third_party/protobuf/
    rm -r $PHXQUEUE_DIR/third_party/glog/
    ln -s $DEP_PREFIX $PHXQUEUE_DIR/third_party/protobuf
    ln -s $DEP_PREFIX $PHXQUEUE_DIR/third_party/glog
  • LevelDB

    Build LevelDB in $PHXQUEUE_DIR/third_party/leveldb/ and ln -s out-static lib.

  • PhxPaxos and PhxRPC

    Build PhxPaxos in $PHXQUEUE_DIR/third_party/phxpaxos/. Build PhxRPC in $PHXQUEUE_DIR/third_party/phxrpc/.

  • libco

    Git clone libco to $PHXQUEUE_DIR/third_party/colib/.

Compile PhxQueue

cd $PHXQUEUE_DIR/
make

PhxQueue distribution

PhxQueue is structured like this:

phxqueue/ ................. The PhxQueue root directory
├── bin/ .................. Generated binary files
├── etc/ .................. Example configuration files
├── lib/ .................. Generated library files
├── phxqueue/ ............. PhxQueue source files
├── phxqueue_phxrpc/ ...... PhxQueue with PhxRPC implementation
└── ...

the output files are located in bin/ and lib/, while the sample configure files are located in etc/.

Quickstart

The built PhxQueue is ready to run simple demos.

Preparing the setup

PhxQueue accesses multiple files at the same time. Make sure to set high enough (> 4000) open file limit with ulimit -Sn or ulimit -n.

Starting the Store nodes

Start 3 Store nodes (add -d if run as daemon) as shown bellow:

bin/store_main -c etc/store_server.0.conf
bin/store_main -c etc/store_server.1.conf
bin/store_main -c etc/store_server.2.conf

You can follow the status of the nodes and check for any errors in these log files as shown bellow:

ps -ef | grep store_main
tail -f log/store.0/store_main.INFO
tail -f log/store.1/store_main.INFO
tail -f log/store.2/store_main.INFO

NOTICE: To run properly, at least 2 Store nodes need to be started, otherwise error log will occur below:

MASTERSTAT: ERR: Propose err. paxos_ret 404 ...

Starting the Consumer nodes

Start 3 Consumer nodes:

bin/consumer_main -c etc/consumer_server.0.conf
bin/consumer_main -c etc/consumer_server.1.conf
bin/consumer_main -c etc/consumer_server.2.conf

You can follow the status of the nodes and check for any errors in these log files as shown bellow:

ps -ef | grep consumer_main
tail -f log/consumer.0/consumer_main.INFO
tail -f log/consumer.1/consumer_main.INFO
tail -f log/consumer.2/consumer_main.INFO

Sending test requests

Now that both Store and Consumer nodes have been deployed, you can use the benchmark tool to send some test requests:

bin/test_producer_echo_main

You will get the output from test Producer:

produce echo succeeded!

Now let's see the output of Consumer (only 1 of 3 Consumer nodes):

consume echo succeeed! ...

Running benchmarks

bin/producer_benchmark_main 10 5 5 10

Watch the Consumer log files:

tail -f log/consumer.0/consumer_main.INFO
tail -f log/consumer.1/consumer_main.INFO
tail -f log/consumer.2/consumer_main.INFO

This is an example of the output you can expect from the Consumer log:

INFO: Dequeue ret 0 topic 1000 consumer_group_id 1 store_id 1 queue_id 44 size 1 prev_cursor_id 9106 next_cursor_id 9109

Clearing data and logs created during testing

While testing PhxQueue, a lot of logs and data is generated. Run log/clear_log.sh to clear logs and data/clear_data.sh to delete data. Make sure that you are running these commands against PhxQueue that does not hold any important data. Commands listed here will result in permanent data loss.

Quickstart with Docker

Quickstart with Docker

Deploy distributed PhxQueue

Normally, each node should be deployed on separate machine. You need to configure etc/*.conf configuration files for each node.

Files located in directory etc/:

globalconfig.conf .................Global config
topicconfig.conf ................. Topic config
storeconfig.conf ................. Store config
consumerconfig.conf ...............Consumer config
schedulerconfig.conf ..............Scheduler config
lockconfig.conf ...................Lock config

Deloy and modify these files on all target machines.

Deploying Store nodes

Store is the storage module for queues, using the Paxos protocol for replica synchronization.

Deploy these configs to 3 Store nodes and start each node:

bin/store_main -c etc/store_server.0.conf -d
bin/store_main -c etc/store_server.1.conf -d
bin/store_main -c etc/store_server.2.conf -d

Deploying Consumer nodes

Consumer pulls and consumes data from Store.

Deploy these configs to 3 Consumer nodes and start each node:

bin/consumer_main -c etc/consumer_server.0.conf -d
bin/consumer_main -c etc/consumer_server.1.conf -d
bin/consumer_main -c etc/consumer_server.2.conf -d

Deploying Lock nodes (Optional)

Lock is a distributed lock module. You can deploy Lock independently, providing a common distributed lock service.

Set skip_lock = 1 in topicconfig.conf to disable distributed Lock.

Deploy these configs to 3 Lock nodes and start each node:

bin/lock_main -c etc/lock_server.0.conf -d
bin/lock_main -c etc/lock_server.1.conf -d
bin/lock_main -c etc/lock_server.2.conf -d

Deploying Scheduler nodes (Optional)

Scheduler gathers global load information from Consumer for disaster recovery and load balancing. If no Scheduler is deployed, Consumer will be assigned according to weight configured.

If you want to deploy Scheduler, you will need to deploy Lock first.

Set use_dynamic_scale = 0 in topicconfig.conf to disable Scheduler.

Deploy these configs to 3 Scheduler nodes and start each node:

bin/scheduler_main -c etc/scheduler_server.0.conf -d
bin/scheduler_main -c etc/scheduler_server.1.conf -d
bin/scheduler_main -c etc/scheduler_server.2.conf -d

Viewing logs

For each node, there is a log file where you can trace current node status and errors. For example, you can access log file for Store node with ID 0 like shown bellow:

tail -f log/store.0/store_main.INFO

Contribution

Please follow Google C++ Style Guide in PRs.

phxqueue's People

Contributors

addvilz avatar cnwuwil avatar hungmingwu avatar taohexxx avatar unixliang avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

phxqueue's Issues

执行build.sh编译,报错

根据最新master的代码,执行build.sh编译,报错如下。

  • autoreconf -f -i -Wall,no-obsolete
    configure.ac:93: error: possibly undefined macro: AC_PROG_LIBTOOL
    If this token and others are legitimate, please use m4_pattern_allow.
    See the Autoconf documentation.
    autoreconf: /usr/bin/autoconf failed with exit status: 1
    checking whether to enable maintainer-specific portions of Makefiles... yes
    configure: error: cannot find install-sh, install.sh, or shtool in "." "./.." "./../.."
    make: *** 没有指明目标并且找不到 makefile。 停止。
    ./autoinstall.sh: 第 84 行:cd: protobuf: 没有那个文件或目录
    protobuf install fail. please check compile error info.

但是自行下载pb3.0.0版本单独编译之后,可以联编通过。

没有C的api吗?

没有C的api吗?现在如何使用了? 编译,运行,测试工具都ok了

使用test_producer_echo_main发送了3个消息,3个consumer_main都没有输出

使用test_producer_echo_main发送了3个消息,3个consumer_main都没有输出
都部署在同一台机器上
使用test_producer_echo_main发送三个消息都消息成功
[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/test_producer_echo_main
produce echo "Pisz1QF7YK" succeeded!
[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/test_producer_echo_main
produce echo "yGGtFBHAjL" succeeded!
[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/test_producer_echo_main
produce echo "Re3r9t3xT1" succeeded!

consumer_main1
bin/consumer_main -c etc/consumer_server.0.conf
consume echo "VDPvMIUOgC" succeeded! sub_id 1 store_id 1 queue_id 29 item_uin 0
consume echo "cZdWvCi3SZ" succeeded! sub_id 1 store_id 1 queue_id 72 item_uin 0
consume echo "3V0CzqjvvV" succeeded! sub_id 1 store_id 1 queue_id 64 item_uin 0
consume echo "BnA6CDEN6j" succeeded! sub_id 1 store_id 1 queue_id 36 item_uin 0
consume echo "XpaVZfuI7w" succeeded! sub_id 1 store_id 1 queue_id 96 item_uin 0

consumer_main1
bin/consumer_main -c etc/consumer_server.1.conf
consume echo "0lcDFvKgAJ" succeeded! sub_id 1 store_id 1 queue_id 24 item_uin 0

consumer_main2
cd tiny/queue/phxqueue/
[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/consumer_main -c etc/consumer_server.2.conf

安装时候报错,有遇到的嘛?是怎么解决的?谢谢

在编译时,我升级了gcc版本到7.2.
make[1]: Leaving directory `/phxqueue/phxqueue/phxqueue_phxrpc/app/scheduler'
g++ -std=c++11 -Wall -D_REENTRANT -D_GNU_SOURCE -D_XOPEN_SOURCE -fPIC -m64 -I/phxqueue/phxqueue//third_party/protobuf/include -O2 -I/phxqueue/phxqueue/ -I/phxqueue/phxqueue//third_party/glog/include -I/phxqueue/phxqueue//third_party/protobuf/include -I/phxqueue/phxqueue//third_party/leveldb/include -I/phxqueue/phxqueue//third_party/colib -I/phxqueue/phxqueue//third_party/phxpaxos/include -I/phxqueue/phxqueue//third_party/phxpaxos/include -I/phxqueue/phxqueue//third_party/phxrpc -I/phxqueue/phxqueue//third_party/phxpaxos/plugin/include -I/phxqueue/phxqueue//phxqueue_phxrpc/test/ -c phxqueue_phxrpc/test/test_rpc_config.cpp -o phxqueue_phxrpc/test/test_rpc_config.o
In file included from /phxqueue/phxqueue/phxqueue/comm.h:16:0,
from /phxqueue/phxqueue/phxqueue/./config/baseconfig.h:18,
from /phxqueue/phxqueue/phxqueue/config.h:15,
from /phxqueue/phxqueue/phxqueue/test/test_config.h:15,
from phxqueue_phxrpc/test/test_rpc_config.h:15,
from phxqueue_phxrpc/test/test_rpc_config.cpp:13:
/phxqueue/phxqueue/phxqueue/./comm/logger.h:48:22: error: ‘function’ in namespace ‘std’ does not name a template type
using LogFunc = std::function<void (const int, const char *, va_list)>;
^~~~~~~~
/phxqueue/phxqueue/phxqueue/./comm/logger.h:67:21: error: ‘LogFunc’ has not been declared
void SetLogFunc(LogFunc log_func);
^~~~~~~
In file included from /phxqueue/phxqueue/phxqueue/comm.h:18:0,
from /phxqueue/phxqueue/phxqueue/./config/baseconfig.h:18,
from /phxqueue/phxqueue/phxqueue/config.h:15,
from /phxqueue/phxqueue/phxqueue/test/test_config.h:15,
from phxqueue_phxrpc/test/test_rpc_config.h:15,
from phxqueue_phxrpc/test/test_rpc_config.cpp:13:
/phxqueue/phxqueue/phxqueue/./comm/masterclient.h:64:30: error: ‘std::function’ has not been declared
std::function<RetCode (Req &, Resp &)> rpc_func) {
^~~~~~~~
/phxqueue/phxqueue/phxqueue/./comm/masterclient.h:64:38: error: expected ‘,’ or ‘...’ before ‘<’ token
std::function<RetCode (Req &, Resp &)> rpc_func) {
^
In file included from /phxqueue/phxqueue/phxqueue/plugin.h:17:0,
from /phxqueue/phxqueue/phxqueue_phxrpc/./plugin/configfactory.h:18,
from /phxqueue/phxqueue/phxqueue_phxrpc/plugin.h:15,
from phxqueue_phxrpc/test/test_rpc_config.cpp:18:
/phxqueue/phxqueue/phxqueue/./plugin/logger_google.h:29:60: error: ‘phxqueue::comm::LogFunc’ has not been declared
const int google_log_level, comm::LogFunc &log_func);
^~~~~~~
In file included from /phxqueue/phxqueue/phxqueue/plugin.h:18:0,
from /phxqueue/phxqueue/phxqueue_phxrpc/./plugin/configfactory.h:18,
from /phxqueue/phxqueue/phxqueue_phxrpc/plugin.h:15,
from phxqueue_phxrpc/test/test_rpc_config.cpp:18:
/phxqueue/phxqueue/phxqueue/./plugin/logger_sys.h:28:111: error: ‘phxqueue::comm::LogFunc’ has not been declared
static int GetLogger(const std::string &module_name, const int sys_log_level, const bool daemonize, comm::LogFunc &pLogFunc);
^~~~~~~
make: *** [phxqueue_phxrpc/test/test_rpc_config.o] Error 1

make的时候报错,请问怎么回事

phxqueue_phxrpc/test/consumer_main.cpp: In function 'int main(int, char**)':
phxqueue_phxrpc/test/consumer_main.cpp:60:28: error: 'phxrpc::ServerUtils' has not been declared
if (daemonize) phxrpc::ServerUtils::Daemonize();
^
phxqueue_phxrpc/test/consumer_main.cpp:109:5: error: 'closelog' is not a member of 'phxrpc'
phxrpc::closelog();
^
phxqueue_phxrpc/test/consumer_main.cpp:109:5: note: suggested alternative:
In file included from /usr/include/syslog.h:1:0,
from /home/ubuntu/phxqueue//third_party/phxrpc/phxrpc/file/log_utils.h:25,
from /home/ubuntu/phxqueue//third_party/phxrpc/phxrpc/file.h:25,
from /home/ubuntu/phxqueue//third_party/phxrpc/phxrpc/rpc/server_config.h:26,
from /home/ubuntu/phxqueue//third_party/phxrpc/phxrpc/rpc.h:25,
from phxqueue_phxrpc/test/consumer_main.cpp:26:
/usr/include/x86_64-linux-gnu/sys/syslog.h:175:13: note: 'closelog'
extern void closelog (void);
^
make: *** [phxqueue_phxrpc/test/consumer_main.o] Error 1

phxqueue docker tag was empty.

Issue as my title.
phxqueue docker tag was empty.

I executed cmd docker pull phxqueue/phxqueue with an issue: manifest for phxqueue/phxqueue:latest not found.

Then i check the tag of phxqueue, its empty...... :(

build failover on MacOS sierra

first issue is compile glog failed, I mannually update glog to 0.3.4 fixed,
but then encounter following error:

ar -cvq /Users/xxx/phxqueue/third_party/phxpaxos/.lib/libphxpaxos.a
ar: no archive members specified

关于paox group的问题

Hello,

问题基于微信开源PhxQueue:高可用、高可靠、高性能的分布式队列

  1. 物理文件存储颗粒度是paxos group,而一个paxos group有多个Queue。这是不是意味着一个文件里有多个Queue的信息?Consumer在读取单个Queue时,是否需要读取Queue所在的paxos group的其他Queue的Message再做filtering还是你们采取了特殊的文件格式(不仅仅是append write)?
  2. 文中提到多DC部署下,单个paxos group的TPS理论最高250。这是不是意味着在多DC部署下,单个Queue的TPS理论最高为250?PhxQueue应该是只能保证单个Queue里Message的顺序,对于一些逻辑250的TPS是不是不够用?

谢谢!

执行 bash build.sh 时报错

libtool: compile: g++ -DHAVE_CONFIG_H -I. -I./src -I./src -Wall -Wwrite-strings -Woverloaded-virtual -Wno-sign-compare -DNO_FRAME_POINTER -DNDEBUG -fPIC -MT libglog_la-demangle.lo -MD -MP -MF .deps/libglog_la-demangle.Tpo -c src/demangle.cc -fPIC -DPIC -o .libs/libglog_la-demangle.o
src/demangle.cc: In function 'bool google::AtLeastNumCharsRemaining(const char*, int)':
src/demangle.cc:170:16: error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
if (str == '\0') {
^~~~
src/demangle.cc: In function 'bool google::ParseCharClass(google::State*, const char*)':
src/demangle.cc:226:29: error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
if (state->mangled_cur == '\0') {
^~~~
make: *** [libglog_la-demangle.lo] Error 1
glog install fail. please check compile error info.

程序运行经常出现core: std::system_error

Hi,运行过程中碰到以下几个问题:
1,程序运行有概率出现core:
root@survey phxqueue [master] $ bin/store_main -c etc/store_server.0.conf
terminate called after throwing an instance of 'std::system_error'
what():
已放弃 (core dumped)
root@survey phxqueue [master] $ bin/store_main -c etc/store_server.0.conf
server already started, 1 io threads 10 workers
listen succ, ip 127.0.0.1 port 5100

root@survey phxqueue [master] $ bin/store_main -c etc/store_server.1.conf
server already started, 1 io threads 10 workers
listen succ, ip 127.0.0.1 port 5200
terminate called after throwing an instance of 'std::system_error'
what():
已放弃 (core dumped)

2,集群模式,即使启动三个store也会出现paxos_ret 404
E0918 11:27:23.969184 12420 logger_google.cpp:79] ERR: PN8phxqueue4lock16KeepMasterThreadE::KeepMaster:257 MASTERSTAT: ERR: Propose err. paxos_ret 404 paxos_group_id 30 buf.length 20
I0918 11:27:23.969236 12420 logger_google.cpp:73] INFO: PN8phxqueue4lock16KeepMasterThreadE::KeepMaster:238 MASTERSTAT: begin try to get back master. paxos_group_id 33 master_rate 100
I0918 11:27:24.850062 12419 logger_google.cpp:73] INFO: PN8phxqueue4lock11CleanThreadE::DoRun:120 now 1180022589 nr_key 0 nr_clean_key 0
E0918 11:27:25.258280 12423 logger_google.cpp:79] ERR: PN8phxqueue4lock4LockE::GetLockInfo:269 paxos_group 34 lock "1000-1-1-0" CheckMaster err 10105
E0918 11:27:25.258319 12423 logger_google.cpp:79] ERR: P15LockServiceImpl::GetLockInfo:42 Lock GetLockInfo err 10105
E0918 11:27:25.469329 12420 logger_google.cpp:79] ERR: PN8phxqueue4lock16KeepMasterThreadE::KeepMaster:257 MASTERSTAT: ERR: Propose err. paxos_ret 404 paxos_group_id 33 buf.length 20
I0918 11:27:25.469388 12420 logger_google.cpp:73] INFO: PN8phxqueue4lock16KeepMasterThreadE::KeepMaster:238 MASTERSTAT: begin try to get back master. paxos_group_id 36 master_rate 100
I0918 11:27:25.850183 12419 logger_google.cpp:73] INFO: PN8phxqueue4lock11CleanThreadE::DoRun:120 now 1180023589 nr_key 0 nr_clean_key 0
I0918 11:27:26.850322 12419 logger_google.cpp:73] INFO: PN8phxqueue4lock11CleanThreadE::DoRun:120 now 1180024589 nr_key 0 nr_clean_key 0
E0918 11:27:26.969480 12420 logger_google.cpp:79] ERR: PN8phxqueue4lock16KeepMasterThreadE::KeepMaster:257 MASTERSTAT: ERR: Propose err. paxos_ret 404 paxos_group_id 36 buf.length 20
I0918 11:27:26.969521 12420 logger_google.cpp:73] INFO: PN8phxqueue4lock16KeepMasterThreadE::KeepMaster:238 MASTERSTAT: begin try to get back master. paxos_group_id 39 master_rate 100
E0918 11:27:27.258903 12424 logger_google.cpp:79] ERR: PN8phxqueue4lock4LockE::GetLockInfo:269 paxos_group 34 lock "1000-1-1-0" CheckMaster err 10105
E0918 11:27:27.258942 12424 logger_google.cpp:79] ERR: P15LockServiceImpl::GetLockInfo:42 Lock GetLockInfo err 10105
I0918 11:27:27.850438 12419 logger_google.cpp:73] INFO: PN8phxqueue4lock11CleanThreadE::DoRun:120 now 1180025589 nr_key 0 nr_clean_key 0

3,启动三个store服务之后,lock[0-1]能正常启动,lock2服务无法启动,日志如下:
0918 11:30:01.655338 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:111 topic_id 1000 paxos_group_id 24 ReadCheckpoint not exist
E0918 11:30:01.655375 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadRestartCheckpoint:240 topic_id 1000 paxos_group_id 24 not exist
E0918 11:30:01.655387 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:127 topic_id 1000 paxos_group_id 24 ReadRestartCheckpoint not exist
I0918 11:30:01.655407 13173 logger_google.cpp:73] INFO: Init:134 topic_id 1000 paxos_group_id 24 cp 18446744073709551615 restart_cp 18446744073709551615
E0918 11:30:04.663666 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadCheckpoint:188 topic_id 1000 paxos_group_id 25 not exist
E0918 11:30:04.663712 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:111 topic_id 1000 paxos_group_id 25 ReadCheckpoint not exist
E0918 11:30:04.663734 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadRestartCheckpoint:240 topic_id 1000 paxos_group_id 25 not exist
E0918 11:30:04.663753 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:127 topic_id 1000 paxos_group_id 25 ReadRestartCheckpoint not exist
I0918 11:30:04.663794 13173 logger_google.cpp:73] INFO: Init:134 topic_id 1000 paxos_group_id 25 cp 18446744073709551615 restart_cp 18446744073709551615
E0918 11:30:04.817548 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadCheckpoint:188 topic_id 1000 paxos_group_id 26 not exist
E0918 11:30:04.817597 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:111 topic_id 1000 paxos_group_id 26 ReadCheckpoint not exist
E0918 11:30:04.817615 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadRestartCheckpoint:240 topic_id 1000 paxos_group_id 26 not exist
E0918 11:30:04.817625 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:127 topic_id 1000 paxos_group_id 26 ReadRestartCheckpoint not exist
I0918 11:30:04.817636 13173 logger_google.cpp:73] INFO: Init:134 topic_id 1000 paxos_group_id 26 cp 18446744073709551615 restart_cp 18446744073709551615
E0918 11:30:05.590431 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadCheckpoint:188 topic_id 1000 paxos_group_id 27 not exist
E0918 11:30:05.590472 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:111 topic_id 1000 paxos_group_id 27 ReadCheckpoint not exist
E0918 11:30:05.590493 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadRestartCheckpoint:240 topic_id 1000 paxos_group_id 27 not exist
E0918 11:30:05.590504 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:127 topic_id 1000 paxos_group_id 27 ReadRestartCheckpoint not exist
I0918 11:30:05.590515 13173 logger_google.cpp:73] INFO: Init:134 topic_id 1000 paxos_group_id 27 cp 18446744073709551615 restart_cp 18446744073709551615
E0918 11:30:06.642740 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadCheckpoint:188 topic_id 1000 paxos_group_id 28 not exist
E0918 11:30:06.642802 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:111 topic_id 1000 paxos_group_id 28 ReadCheckpoint not exist
E0918 11:30:06.642829 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::ReadRestartCheckpoint:240 topic_id 1000 paxos_group_id 28 not exist
E0918 11:30:06.642849 13173 logger_google.cpp:79] ERR: PN8phxqueue4lock7LockMgrE::Init:127 topic_id 1000 paxos_group_id 28 ReadRestartCheckpoint not exist
I0918 11:30:06.642868 13173 logger_google.cpp:73] INFO: Init:134 topic_id 1000 paxos_group_id 28 cp 18446744073709551615 restart_cp 18446744073709551615

还有就是几个服务运行在一个docker中,确实非常占资源,机械硬盘的机器上负载瞬间飙到300+

make[1]: *** 没有规则可以创建“test_propose_batch”需要的目标“test_propose_batch.o”。 停止

操作
cd $PHXPAXOS_DIR
./autoinstall.sh && make -j $kernal_num && make install

报错
error: make[1]: *** 没有规则可以创建“test_propose_batch”需要的目标“test_propose_batch.o”。 停止。

我的临时解决方法
查看makefile,干掉这个文件的编译。

我的疑问
这是bug么?如果不是,那么有没有更好的解决方法?
如果是,什么时候修复呢?

有计划支持C#的时间表吗

微软已经开源.NET 版本叫.NET Core ,支持跨平台,支持Linux, 这个消息队列非常强悍,看目前仅支持C++,期待能有支持C# 等其他语言。

消息传递语义

您好,对phxqueue的介绍是at-least-once的消息传递语义,且用于业务型消息中间件,那么这个语义是否足够呢?对于支付这样的场景,不应该是exactly-once才可以吗?我个人理解是,phxqueue可靠性做得好,可以保证到达系统的消息完全可靠,只是目前没有对消费状态进行记录,业务应用使用时需要自己维护消费状态来达到exactly-once。请问是这样的吗?

dependency not complete

build by build.sh

in phxqueue/makefile.mk should link gflags or it will report
more undefined references to `google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, char const*, void*, void*)'
and ohter missing link as
table_builder.cc:(.text+0x8ff):对‘snappy::MaxCompressedLength(unsigned long)’未定义的引用
table_builder.cc:(.text+0x93d):对‘snappy::RawCompress(char const*, unsigned long, char*, unsigned long*)’未定义的引用
/data/wangxiaolei-s/3rd/phxqueue//third_party/leveldb/lib/libleveldb.a(format.o):在函数‘leveldb::ReadBlock(leveldb::RandomAccessFile*, leveldb::ReadOptions const&, leveldb::BlockHandle const&, leveldb::BlockContents*)’中:
format.cc:(.text+0x4d9):对‘snappy::GetUncompressedLength(char const*, unsigned long, unsigned long*)’未定义的引用
format.cc:(.text+0x5ae):对‘snappy::RawUncompress(char const*, unsigned long, char*)’未定义的引用

make报错 error: 'BaseDispatcher' in namespace 'phxrpc' does not name a template type

安装了gcc-7.2.0, 按照网址上的安装方法,到最后的 make 时报下面的错误,重新按照步骤进行也还是不行,麻烦有时间能帮忙看看吗?
g++ -std=c++11 -Wall -D_REENTRANT -D_GNU_SOURCE -D_XOPEN_SOURCE -fPIC -m64 -I/DATA/fengye/phxqueue-master//third_party/protobuf/include -O2 -o phxrpc_store_dispatcher.o -c phxrpc_store_dispatcher.cpp -I/DATA/fengye/phxqueue-master//third_party/glog/include -I/DATA/fengye/phxqueue-master//third_party/protobuf/include
-I/DATA/fengye/phxqueue-master//third_party/leveldb/include -I/DATA/fengye/phxqueue-master//third_party/colib -I/DATA/fengye/phxqueue-master//third_party/phxrpc
-I/DATA/fengye/phxqueue-master//third_party/phxpaxos/include -I/DATA/fengye/phxqueue-master//third_party/phxpaxos/plugin/include -I/DATA/fengye/phxqueue-master/
In file included from phxrpc_store_dispatcher.cpp:21:0:
phxrpc_store_dispatcher.h:32:26: error: 'BaseDispatcher' in namespace 'phxrpc' does not name a template type
static const phxrpc::BaseDispatcher::MqttFuncMap &GetMqttFuncMap();
^~~~~~~~~~~~~~
phxrpc_store_dispatcher.h:33:26: error: 'BaseDispatcher' in namespace 'phxrpc' does not name a template type
static const phxrpc::BaseDispatcher::URIFuncMap &GetURIFuncMap();
^~~~~~~~~~~~~~
phxrpc_store_dispatcher.h:39:31: error: 'BaseRequest' in namespace 'phxrpc' does not name a type
int PhxEcho(const phxrpc::BaseRequest const req,
^~~~~~~~~~~
phxrpc_store_dispatcher.h:40:25: error: 'phxrpc::BaseResponse' has not been declared
phxrpc::BaseResponse const resp);
^~~~~~~~~~~~
phxrpc_store_dispatcher.h:42:27: error: 'BaseRequest' in namespace 'phxrpc' does not name a type
int Add(const phxrpc::BaseRequest const req,
^~~~~~~~~~~
phxrpc_store_dispatcher.h:43:21: error: 'phxrpc::BaseResponse' has not been declared
phxrpc::BaseResponse const resp);
^~~~~~~~~~~~
phxrpc_store_dispatcher.h:45:27: error: 'BaseRequest' in namespace 'phxrpc' does not name a type
int Get(const phxrpc::BaseRequest const req,
^~~~~~~~~~~
phxrpc_store_dispatcher.h:46:21: error: 'phxrpc::BaseResponse' has not been declared
phxrpc::BaseResponse const resp);
^~~~~~~~~~~~
phxrpc_store_dispatcher.cpp:37:15: error: 'BaseDispatcher' in namespace 'phxrpc' does not name a template type
const phxrpc::BaseDispatcher::MqttFuncMap &
^~~~~~~~~~~~~~
phxrpc_store_dispatcher.cpp:43:15: error: 'BaseDispatcher' in namespace 'phxrpc' does not name a template type
const phxrpc::BaseDispatcher::URIFuncMap &
^~~~~~~~~~~~~~
phxrpc_store_dispatcher.cpp:52:44: error: 'BaseRequest' in namespace 'phxrpc' does not name a type
int StoreDispatcher::PhxEcho(const phxrpc::BaseRequest const req,
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp:53:38: error: 'phxrpc::BaseResponse' has not been declared
phxrpc::BaseResponse const resp) {
^~~~~~~~~~~~
phxrpc_store_dispatcher.cpp: In member function 'int StoreDispatcher::PhxEcho(const int
, int
)':
phxrpc_store_dispatcher.cpp:63:42: error: request for member 'GetContent' in '
(const int
)req', which is of non-class type 'const int'
if (!req_pb.ParseFromString(req->GetContent())) {
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:65:30: error: request for member 'GetContent' in '
(const int
)req', which is of non-class type 'const int'
req->GetContent().size(), req->GetClientIP());
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:65:56: error: request for member 'GetClientIP' in '
(const int
)req', which is of non-class type 'const int'
req->GetContent().size(), req->GetClientIP());
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp:77:48: error: request for member 'GetContent' in '(int)resp', which is of non-class type 'int'
if (!resp_pb.SerializeToString(&(resp->GetContent()))) {
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:79:30: error: request for member 'GetClientIP' in '(const int)req', which is of non-class type 'const int'
req->GetClientIP());
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp: At global scope:
phxrpc_store_dispatcher.cpp:89:40: error: 'BaseRequest' in namespace 'phxrpc' does not name a type
int StoreDispatcher::Add(const phxrpc::BaseRequest const req,
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp:90:34: error: 'phxrpc::BaseResponse' has not been declared
phxrpc::BaseResponse const resp) {
^~~~~~~~~~~~
phxrpc_store_dispatcher.cpp: In member function 'int StoreDispatcher::Add(const int
, int
)':
phxrpc_store_dispatcher.cpp:100:42: error: request for member 'GetContent' in '(const int)req', which is of non-class type 'const int'
if (!req_pb.ParseFromString(req->GetContent())) {
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:102:30: error: request for member 'GetContent' in '(const int)req', which is of non-class type 'const int'
req->GetContent().size(), req->GetClientIP());
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:102:56: error: request for member 'GetClientIP' in '(const int)req', which is of non-class type 'const int'
req->GetContent().size(), req->GetClientIP());
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp:114:48: error: request for member 'GetContent' in '(int)resp', which is of non-class type 'int'
if (!resp_pb.SerializeToString(&(resp->GetContent()))) {
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:116:30: error: request for member 'GetClientIP' in '(const int)req', which is of non-class type 'const int'
req->GetClientIP());
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp: At global scope:
phxrpc_store_dispatcher.cpp:126:40: error: 'BaseRequest' in namespace 'phxrpc' does not name a type
int StoreDispatcher::Get(const phxrpc::BaseRequest const req,
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp:127:34: error: 'phxrpc::BaseResponse' has not been declared
phxrpc::BaseResponse const resp) {
^~~~~~~~~~~~
phxrpc_store_dispatcher.cpp: In member function 'int StoreDispatcher::Get(const int
, int
)':
phxrpc_store_dispatcher.cpp:137:42: error: request for member 'GetContent' in '(const int)req', which is of non-class type 'const int'
if (!req_pb.ParseFromString(req->GetContent())) {
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:139:30: error: request for member 'GetContent' in '(const int)req', which is of non-class type 'const int'
req->GetContent().size(), req->GetClientIP());
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:139:56: error: request for member 'GetClientIP' in '(const int)req', which is of non-class type 'const int'
req->GetContent().size(), req->GetClientIP());
^~~~~~~~~~~
phxrpc_store_dispatcher.cpp:151:48: error: request for member 'GetContent' in '(int)resp', which is of non-class type 'int'
if (!resp_pb.SerializeToString(&(resp->GetContent()))) {
^~~~~~~~~~
phxrpc_store_dispatcher.cpp:153:30: error: request for member 'GetClientIP' in '(const int)req', which is of non-class type 'const int'
req->GetClientIP());
^~~~~~~~~~~
make[1]: *** [phxrpc_store_dispatcher.o] Error 1
make[1]: Leaving directory `/DATA/fengye/phxqueue-master/phxqueue_phxrpc/app/store'
make: *** [phxqueue_phxrpc/app/store/store_main] Error 2

自动build失败

我按照github中的介绍,执行自动构建失败:

utilities.cc:(.text+0x820): undefined reference to _Ux86_64_getcontext' utilities.cc:(.text+0x839): undefined reference to _ULx86_64_init_local'
utilities.cc:(.text+0x892): undefined reference to _ULx86_64_get_reg' utilities.cc:(.text+0x8e2): undefined reference to _ULx86_64_step'
collect2: error: ld returned 1 exit status

<<微信开源PhxQueue:高可用、高可靠、高性能的分布式队列>>的几个问题

关于这篇文章,我有几个疑问:

1.测试结果表格里面的kafka同步刷盘和异步刷盘,我认为不准确,会误导用户,因为同步刷盘意味着log.flush.interval.messages=1.而我的理解是文中想表达的是kafka消息同步复制和消息异步复制,即acks=-1和acks=1。因为producer的send()已经是异步发送消息了。

2.关于一些配置数据,我想了解一下
(1)开启 Producer Batch,batch.size设置的多大?
(2)使用的是哪个版本的kafka? 0.11.0?
(3)基准测试中修改了哪些kafka的server默认配置参数?

3.ISR VS Paxos ,采用ISR而不采用Paxos,kafka当初设计时考虑到,如果采用Paxos,集群中如果容忍2个节点不可用,需要部署5个节点,而ISR只用部署3个节点
文章说同步延时取决于最慢的节点,这个我的理解是kafka从0.8.2版本引入min.insync.replicas,主要是平衡可靠性和吞吐量,根据文中的设置参数min.insync.replicas=2,意味着3个节点中只要有2个消息复制成功,就返回ack,而不是取决于最慢的节点

4.感觉这篇文章不是一个人写的,因为有些概念不一致,尤其是kafka的一些新特性是在不同版本中才具备的,文中出现不一致的情况,比如我提到的第三个问题

启动store_server 报错

bin/store_main -c etc/store_server.0.conf
ERR: MakeArgs ret -1

日志如下:
Log file created at: 2018/01/04 19:11:30
Running on machine: iZ8vbcyspccwhdj71zlryfZ
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0104 19:11:30.279340 19092 logger_google.cpp:79] ^[[41;37m ERR: PN15phxqueue_phxrpc6config10BaseConfigIN8phxqueue6config12GlobalConfigEEE::LoadFile:36 FileConfig::Read fail. file ./etc/globalconfig.conf ^[[0m
E0104 19:11:30.279722 19092 logger_google.cpp:79] ^[[41;37m ERR: PN8phxqueue5store5StoreE::InitTopicID:79 GetTopicIDByTopicName ret -10102 topic test ^[[0m
E0104 19:11:30.279737 19092 logger_google.cpp:79] ^[[41;37m ERR: PN8phxqueue5store5StoreE::Init:144 InitTopicID ret -10102 topic test ^[[0m
E0104 19:11:30.279744 19092 logger_google.cpp:79] ^[[41;37m ERR: MakeArgs:66 init phxqueue store err ^[[0m

查看配置没有什么问题

glog对unwind依赖问题

编译过程中编译到test模块的时候发现以下问题
utilities.cc:(.text+0x867): undefined reference to _Ux86_64_getcontext' utilities.cc:(.text+0x880): undefined reference to _ULx86_64_init_local'
utilities.cc:(.text+0x8ec): undefined reference to _ULx86_64_get_reg' utilities.cc:(.text+0x949): undefined reference to _ULx86_64_step'

原因是glog中依赖了unwind,在makefile.mk的LDFLAGS加unwind解决了。

你的程序要求必须是ssd磁盘?我在普通机器上启动完成,iowait都70-90%

avg-cpu: %user %nice %system %iowait %steal %idle
1.01 0.00 1.13 72.36 0.00 25.50

[xdpp@tpd2 ~]$ ps -ef|grep store
xdpp 8816 1 4 02:01 ? 00:01:17 bin/store_main -c etc/store_server.0.conf -d
xdpp 8818 1 4 02:01 ? 00:01:16 bin/store_main -c etc/store_server.1.conf -d
xdpp 8820 1 2 02:01 ? 00:00:42 bin/store_main -c etc/store_server.2.conf -d
root 13015 1 0 Aug02 ? 00:00:00 /usr/libexec/tracker-store
xdpp 14411 13913 0 02:28 pts/1 00:00:00 grep --color=auto store

杀了进程就好了。
[xdpp@tpd2 ~]$ kill -9 8816 8818 8820
[xdpp@tpd2 ~]$ iostat 2 2
Linux 3.10.0-514.el7.x86_64 (tpd2) 09/17/2017 x86_64 (8 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
0.48 0.00 0.25 0.08 0.00 99.20

MQ客户端使用问题

有几个问题:

  1. 是否支持C++之外的语言?(微信的系统里,非C++的系统如何接入?)
  2. 生产者和消费者,是否需要额外编程才能够使用?(类似test里的SimpleConsumer和SimpleProducer)?
  3. 有无节点crash时,failover的时间的数据?(可以以同机房多个交换机下的网络环境为例)?
  4. 希望提供更多使用上的Demo(生产者/消费者 & PTP/TOPIC),方便大家更便利的进行测试(这是后续深入测试和研究的基础)。

辛苦不忙时予以解答。

test_store_main问题

test_producer_echo_main发送消息 ,消费者可能收到 ,为什么test_store_main不行,代码如下

test_store_main.cpp
int main(int argc, char ** argv) {
phxqueue::comm::LogFunc log_func;
phxqueue::plugin::LoggerGoogle::GetLogger("test_producer", "/tmp/phxqueue/log", 3, log_func);

const string global_config_path("./etc/globalconfig.conf");

phxqueue::plugin::ConfigFactory::SetConfigFactoryCreateFunc(
	[global_config_path]()->unique_ptr < phxqueue::plugin::ConfigFactory > {
	return unique_ptr<phxqueue::plugin::ConfigFactory>(
		new phxqueue_phxrpc::plugin::ConfigFactory(global_config_path));
});


phxqueue::producer::ProducerOption opt;
opt.log_func = log_func;

phxqueue::test::SimpleProducer producer(opt);
producer.Init();

string buf = "buffer";
//producer.Enqueue(1000, 123, 1, "buffer", 1);
phxqueue::comm::RetCode ret = producer.Enqueue(1000, 0, 3, buf, 1);
//phxqueue::comm::RetCode ret{producer->Enqueue(topic_id, uin, handle_id, buf, pub_id)};
if (phxqueue::comm::RetCode::RET_OK == ret) 
{
	printf("produce echo \"%s\" succeeded!\n", buf.c_str());
	fflush(stdout);
}
else {
	printf("produce echo \"%s\" failed return %d!\n", buf.c_str(), phxqueue::comm::as_integer(ret));
	fflush(stdout);
}

return 0;

}

test_producer_echo_main代码
int main(int argc, char **argv) {
const char *module_name{program_invocation_short_name};

if (argc < 1) {
    ShowUsage(module_name);
}

const string global_config_path("./etc/globalconfig.conf");

phxqueue::plugin::ConfigFactory::SetConfigFactoryCreateFunc(
        [global_config_path]()->unique_ptr<phxqueue::plugin::ConfigFactory> {
            return unique_ptr<phxqueue::plugin::ConfigFactory>(
                    new phxqueue_phxrpc::plugin::ConfigFactory(global_config_path));
        });

constexpr uint32_t len{10};
char random_str[len + 1]{'\0'};
GenRandomString(random_str, len);
string buf(random_str);

phxqueue::producer::ProducerOption opt;
unique_ptr<phxqueue::producer::Producer> producer;
producer.reset(new phxqueue_phxrpc::producer::Producer(opt));
producer->Init();

const int topic_id{1000};   //主题id
const uint64_t uin{0};      //业务层的key,相同的key入同一个队列 
const int handle_id{3};    //这个消息被订阅者接受后,那个上层回调函数处理
const int pub_id{1};       //topicinfo里的pubs段,这个段指定发布者和订阅者的关系

phxqueue::comm::RetCode ret{producer->Enqueue(topic_id, uin, handle_id, buf, pub_id)};

if (phxqueue::comm::RetCode::RET_OK == ret) {
    printf("produce echo \"%s\" succeeded!\n", buf.c_str());
    fflush(stdout);
} else {
    printf("produce echo \"%s\" failed return %d!\n", buf.c_str(), phxqueue::comm::as_integer(ret));
    fflush(stdout);
}

return 0;

}

2个代码我做了一下修改,没什么区别,可是test_store_main发送消息返回成功,但是消费者没有收到

bash build.sh 报错

  • autoreconf -f -i -Wall,no-obsolete
    ./autogen.sh: 43: ./autogen.sh: autoreconf: not found
    ./autoinstall.sh: line 122: ./configure: No such file or directory
    make: *** No targets specified and no makefile found. Stop.
    ./autoinstall.sh: line 84: cd: protobuf: No such file or directory
    protobuf install fail. please check compile error info.

store_main执行时报错。

[root@localhost phxqueue]# bin/store_main -c etc/store_server.0.conf
ERR: MakeArgs ret -1

版本如下:
[root@localhost phxqueue]# git log
commit c899588
Author: unixliang [email protected]
Date: Sat Sep 16 11:21:41 2017 +0000

set different shm_key to each consumer

queue_info_id的ranges和consumers关系问题

如下配置 ,ranges": ["0-3"] (这个好像会有4个队列0-4),,我开始的3个consumer,,,,
{
"consumers":
[
{
"addr":
{
"ip": "127.0.0.1",
"port": "8001"
},
"scale": "1000",
"sub_ids": [1]
},

   {
        "addr":
        {
            "ip": "127.0.0.1",
            "port": "8002"
        },
        "scale": "1000",
        "sub_ids": [1]
    },

    {
        "addr":
        {
            "ip": "127.0.0.1",
            "port": "8002"
        },
        "scale": "1000",
        "sub_ids": [1]
    }
]

}

{
"topic":
{
"topic_id": "1000",
"handle_ids": ["1", "2","3"],
"store_paxos_batch_count": "80",
"store_paxos_batch_delay_time_ms": "30"
},
"queue_infos":
[
{
"queue_info_id": "1",
"ranges": ["0-2"]
}
],
"pubs":
[
{
"pub_id": "1",
"sub_ids": ["1"],
"queue_info_ids": ["1"]
}
],
"subs":
[
{
"sub_id": "1",
"use_dynamic_scale": "0",
"skip_lock": "1"
}
]
}

看如下输出
[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/consumer_main -c etc/consumer_server.0.conf

mytest consume echo "2J0w4ZrKeX" succeeded! sub_id 1 store_id 1 queue_id 3 item_uin 0
mytest consume echo "9eB7fb94CU" succeeded! sub_id 1 store_id 1 queue_id 3 item_uin 8
mytest consume echo "8i39WwoEkw" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 9
mytest consume echo "BbJIVJ3Yqt" succeeded! sub_id 1 store_id 1 queue_id 3 item_uin 11
mytest consume echo "WpMK6SnRtp" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 12
mytest consume echo "ioE7Bg456h" succeeded! sub_id 1 store_id 1 queue_id 3 item_uin 13
mytest consume echo "dKqiyFQ8n4" succeeded! sub_id 1 store_id 1 queue_id 3 item_uin 14
mytest consume echo "s38ipUniO6" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 15

[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/consumer_main -c etc/consumer_server.1.conf

mytest consume echo "GCaOQhrNPS" succeeded! sub_id 1 store_id 1 queue_id 2 item_uin 1
mytest consume echo "dxNsISq4jT" succeeded! sub_id 1 store_id 1 queue_id 2 item_uin 2
mytest consume echo "NwvNzdHAWA" succeeded! sub_id 1 store_id 1 queue_id 1 item_uin 3
mytest consume echo "rSSiLJx6Js" succeeded! sub_id 1 store_id 1 queue_id 2 item_uin 4
mytest consume echo "HP9dWvvXYQ" succeeded! sub_id 1 store_id 1 queue_id 1 item_uin 5
mytest consume echo "SOnE40mUf0" succeeded! sub_id 1 store_id 1 queue_id 1 item_uin 6
mytest consume echo "xBEWX56qMe" succeeded! sub_id 1 store_id 1 queue_id 2 item_uin 7

[root@iZuf6hx16c6lym9fxx51xcZ phxqueue]# bin/consumer_main -c etc/consumer_server.2.conf

mytest consume echo "HZgJqPTrpf" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 10
mytest consume echo "WpMK6SnRtp" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 12
mytest consume echo "s38ipUniO6" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 15

1.有下面几个问题 consumer是怎么和队列对应的,还是各个consumer抢占拉取数据没有对应关系
2.为什么有些记录会被2个consumer消费,比如mytest consume echo "WpMK6SnRtp" succeeded! sub_id 1 store_id 1 queue_id 0 item_uin 12 ,,在consumer0和consumer2都有输出

编译时错误

/phxqueue-master/phxqueue/config/proto/topicconfig.pb.h:17:2: 错误:#error This file was generated by an older version of protoc which is incompatible with your Protocol Buffer headers. Please regenerate this file with a newer version of protoc.

Segmentation fault (core dumped)

编译完成
[root@localhost phxqueue-master]# ll ~/phxqueue-master/bin
total 81388
-rwxr-x--- 1 root root 5481431 Sep 13 19:24 consumer_main
-rwxr-x--- 1 root root 6164740 Sep 13 19:24 lock_main
-rwxr-x--- 1 root root 3599574 Sep 13 19:24 lock_tool_main
-rwxr-x--- 1 root root 5117934 Sep 13 19:24 producer_benchmark_main
-rwxr-x--- 1 root root 5360373 Sep 13 19:24 scheduler_main
-rwxr-x--- 1 root root 3596118 Sep 13 19:24 scheduler_tool_main
-rwxr-x--- 1 root root 6207847 Sep 13 19:24 store_main
-rwxr-x--- 1 root root 3599327 Sep 13 19:24 store_tool_main
-rwxr-x--- 1 root root 3831632 Sep 13 19:24 test_config_main
-rwxr-x--- 1 root root 4260253 Sep 13 19:24 test_consumer_main
-rwxr-x--- 1 root root 4087214 Sep 13 19:24 test_load_config_main
-rwxr-x--- 1 root root 4937975 Sep 13 19:24 test_lock_main
-rwxr-x--- 1 root root 187347 Sep 13 19:24 test_log_main
-rwxr-x--- 1 root root 143758 Sep 13 19:24 test_notifierpool_main
-rwxr-x--- 1 root root 3677555 Sep 13 19:24 test_plugin_main
-rwxr-x--- 1 root root 5106439 Sep 13 19:24 test_producer_echo_main
-rwxr-x--- 1 root root 4076620 Sep 13 19:24 test_producer_main
-rwxr-x--- 1 root root 4872753 Sep 13 19:24 test_rpc_config_main
-rwxr-x--- 1 root root 4014373 Sep 13 19:24 test_scheduler_main
-rwxr-x--- 1 root root 4978414 Sep 13 19:24 test_store_main

运行时报错
[root@localhost phxqueue-master]# bin/store_main -c etc/store_server.0.conf
Segmentation fault (core dumped)

没有日志生成
[root@localhost log]# ls /root/phxqueue-master/log/*
/root/phxqueue-master/log/clear_log.sh
/root/phxqueue-master/log/consumer.0:
/root/phxqueue-master/log/consumer.1:
/root/phxqueue-master/log/consumer.2:
/root/phxqueue-master/log/lock.0:
/root/phxqueue-master/log/lock.1:
/root/phxqueue-master/log/lock.2:
/root/phxqueue-master/log/scheduler.0:
/root/phxqueue-master/log/scheduler.1:
/root/phxqueue-master/log/scheduler.2:
/root/phxqueue-master/log/store.0:
/root/phxqueue-master/log/store.1:
/root/phxqueue-master/log/store.2:

gdb查看core文件
[root@localhost phxqueue-master]# gdb bin/store_main core.2953
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-92.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /root/phxqueue-master/bin/store_main...done.
[New Thread 2953]
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /opt/gcc/gcc-4.8.2/lib64/libstdc++.so.6...done.
Loaded symbols for /opt/gcc/gcc-4.8.2/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /opt/gcc/gcc-4.8.2/lib64/libgcc_s.so.1...done.
Loaded symbols for /opt/gcc/gcc-4.8.2/lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by bin/store_main -c etc/store_server.0.conf'. Program terminated with signal 11, Segmentation fault. #0 0x00007fdd0e537c9c in vfprintf () from /lib64/libc.so.6 warning: File "/opt/gcc/gcc-4.8.2/lib64/libstdc++.so.6.0.18-gdb.py" auto-loading has been declined by your auto-load safe-path' set to "/usr/share/gdb/auto-load:/usr/lib/debug:/usr/bin/mono-gdb.py".
To enable execution of this file add
add-auto-load-safe-path /opt/gcc/gcc-4.8.2/lib64/libstdc++.so.6.0.18-gdb.py
line to your configuration file "/root/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/root/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x00007fdd0e537c9c in vfprintf () from /lib64/libc.so.6
#1 0x00007fdd0e55f5c2 in vsnprintf () from /lib64/libc.so.6
#2 0x0000000000480b22 in phxqueue::plugin::LoggerGoogle::Log(int, char const*, __va_list_tag*) ()
#3 0x000000000048083d in phxqueue::comm::Logger::LogVerbose(char const*, ...) ()
#4 0x000000000048331f in phxpaxos::Logger::LogWarning(char const*, ...) ()
#5 0x000000000049b989 in phxpaxos::LogStore::RebuildIndex(phxpaxos::Database*, int&) ()
#6 0x000000000049bcbe in phxpaxos::LogStore::Init(std::basic_string<char, std::char_traits, std::allocator > const&, int, phxpaxos::Database*) ()
#7 0x0000000000497062 in phxpaxos::Database::Init(std::basic_string<char, std::char_traits, std::allocator > const&, int) ()
#8 0x0000000000499068 in phxpaxos::MultiDatabase::Init(std::basic_string<char, std::char_traits, std::allocator > const&, int) ()
#9 0x0000000000484832 in phxpaxos::PNode::InitLogStorage(phxpaxos::Options const&, phxpaxos::LogStorage*&) ()
#10 0x00000000004855f2 in phxpaxos::PNode::Init(phxpaxos::Options const&, phxpaxos::NetWork*&) ()
#11 0x00000000004813d8 in phxpaxos::Node::RunNode(phxpaxos::Options const&, phxpaxos::Node*&) ()
#12 0x00000000004223a4 in phxqueue::store::Store::PaxosInit() ()
#13 0x0000000000422b3b in phxqueue::store::Store::Init() ()
#14 0x000000000040b9de in main ()
(gdb)

[root@localhost lib64]# uname -a
Linux localhost 2.6.32-504.el6.x86_64 #1 SMP Tue Sep 16 01:56:35 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost lib64]# rpm -qa|grep glibc
glibc-common-2.12-1.209.el6_9.2.x86_64
glibc-devel-2.12-1.209.el6_9.2.x86_64
glibc-2.12-1.209.el6_9.2.x86_64
glibc-headers-2.12-1.209.el6_9.2.x86_64
glibc-devel-2.12-1.209.el6_9.2.i686
glibc-2.12-1.209.el6_9.2.i686

请帮忙分析一下原因 是c库函数版本的问题吗?

非常感谢~!

出入队严格有序问题

出入队严格有序是 比如生产者发送2个消息A ,B,,先发送A在发送B
但是网络或者其他原因导致B先到达store的队列里,这样那一个消息会被先消费了

【疑问】出入队列是否全局有序?

找不到提问社区,请允许我在这里提问.

【问题】
请问 "出入队严格有序" 这一个feature指的是 全局队列有序 还是指 单一队列(我的理解就是一个Store)的出入严格有序?如果是后者,要满足全量数据严格有序的话 producer 在入队列的时候就需要做数据倾斜(例如 通过key来约束)? 谢谢~

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.