Comments (29)
Here is the error output:
=== REDIS BUG REPORT START: Cut & paste starting from here ===
576:M 07 Jun 2022 18:53:26.024 # Redis 6.2.2~3 crashed by signal: 4, si_code: 2
576:M 07 Jun 2022 18:53:26.025 # Crashed running the instruction at: 0x94be49
------ STACK TRACE ------
EIP:
./redrock 0.0.0.0:4301 [cluster][0x94be49]
Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fb60b586390]
./redrock 0.0.0.0:4301 [cluster][0x94be49]
./redrock 0.0.0.0:4301 cluster[0x95a9f9]
./redrock 0.0.0.0:4301 cluster[0x960938]
./redrock 0.0.0.0:4301 cluster[0x960b55]
./redrock 0.0.0.0:4301 cluster[0x92c2d3]
./redrock 0.0.0.0:4301 cluster[0x932d90]
./redrock 0.0.0.0:4301 cluster[0x80d2a3]
./redrock 0.0.0.0:4301 cluster[0x82ffa0]
./redrock 0.0.0.0:4301 cluster[0x71e4d3]
./redrock 0.0.0.0:4301 cluster[0x703f47]
./redrock 0.0.0.0:4301 cluster[0x6f9f60]
./redrock 0.0.0.0:4301 cluster[0x6eeb3d]
./redrock 0.0.0.0:4301 [cluster][0x6488b0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fb60b57c6ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb60ad1a51d]
------ REGISTERS ------
576:M 07 Jun 2022 18:53:26.029 #
RAX:00000000642e9b12 RBX:00007fb6040296a0
RCX:00007fb6040472d0 RDX:0000000000000140
RDI:00007fb5ebffcf19 RSI:0000000000000006
RBP:00007fb5ebffbf30 RSP:00007fb5ebffbf00
R8 :00007fb5ebffc300 R9 :00007fb5ebffbf60
R10:00007fb5ebffbf50 R11:00007fb60ada8090
R12:0000000000000000 R13:0000000000000019
R14:00007fb5ebffc870 R15:000000000095ab00
RIP:000000000094be49 EFL:0000000000010202
CSGSFS:0000000000000033
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0f) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0e) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0d) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0c) -> 00007fb6040478d0
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0b) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0a) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf09) -> 000000000095a9f9
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf08) -> 00007fb5ebffbf90
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf07) -> 000000000095a9f9
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf06) -> 00007fb5ebffbf90
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf05) -> 00007fb5ebffbf50
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf04) -> 00007fb5ebffc150
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf03) -> 642e9b12ebffc150
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf02) -> 0000000000000006
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf01) -> 00007fb604046d02
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf00) -> 0000000000000000
------ INFO OUTPUT ------
Server
redis_version:6.2.2~3
redis_git_sha1:e0d05944
redis_git_dirty:1
redis_build_id:7decd1451888d8bb
redis_mode:cluster
os:Linux 4.4.0-210-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:7.3.1
process_id:576
process_supervised:no
run_id:86ca2c59a0b55e54d528c6b622dc44482890eb2f
tcp_port:4301
server_time_usec:1654599206024399
uptime_in_seconds:292
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:10432037
executable:/ulwork/redis4301/./redrock
config_file:/ulwork/redis4301/redis.conf
io_threads_active:0
Clients
connected_clients:1
cluster_connections:4
maxclients:10000
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
Memory
used_memory:5585672
used_memory_human:5.33M
used_memory_rss:94072832
used_memory_rss_human:89.71M
used_memory_peak:185100944
used_memory_peak_human:176.53M
used_memory_peak_perc:3.02%
used_memory_overhead:3354784
used_memory_startup:1495984
used_memory_dataset:2230888
used_memory_dataset_perc:54.55%
allocator_allocated:5791640
allocator_active:10039296
allocator_resident:15921152
total_system_memory:8361390080
total_system_memory_human:7.79G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.73
allocator_frag_bytes:4247656
allocator_rss_ratio:1.59
allocator_rss_bytes:5881856
rss_overhead_ratio:5.91
rss_overhead_bytes:78151680
mem_fragmentation_ratio:17.04
mem_fragmentation_bytes:88551104
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1654598914
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
Stats
total_connections_received:4
total_commands_processed:11
instantaneous_ops_per_sec:0
total_net_input_bytes:280
total_net_output_bytes:1256125
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:2
evicted_keys:0
keyspace_hits:1
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:15
total_writes_processed:25
io_threaded_reads_processed:0
io_threaded_writes_processed:0
Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:00ff4cb3adeeb04d8c234a9551b6dc8e387e6488
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
CPU
used_cpu_sys:0.816000
used_cpu_user:2.784000
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:0.000000
used_cpu_user_main_thread:0.004000
Modules
Commandstats
cmdstat_keys:calls=3,usec=36955,usec_per_call=12318.33,rejected_calls=0,failed_calls=0
cmdstat_auth:calls=4,usec=32,usec_per_call=8.00,rejected_calls=0,failed_calls=0
cmdstat_command:calls=3,usec=3067,usec_per_call=1022.33,rejected_calls=0,failed_calls=0
cmdstat_rockall:calls=1,usec=1452897,usec_per_call=1452897.00,rejected_calls=0,failed_calls=0
Errorstats
Cluster
cluster_enabled:1
Keyspace
db0:keys=33362,expires=0,avg_ttl=0
------ CLIENT LIST OUTPUT ------
id=6 addr=172.17.0.1:34536 laddr=172.17.0.10:4301 fd=23 name= age=1 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=9 obl=0 oll=0 omem=0 tot-mem=61473 events=r cmd=hget user=default redir=-1
------ MODULES INFO OUTPUT ------
------ FAST MEMORY TEST ------
576:M 07 Jun 2022 18:53:26.030 # main thread terminated
576:M 07 Jun 2022 18:53:26.030 # Bio thread for job type #0 terminated
576:M 07 Jun 2022 18:53:26.030 # Bio thread for job type #1 terminated
576:M 07 Jun 2022 18:53:26.030 # Bio thread for job type #2 terminated
Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.
------ DUMPING CODE AROUND EIP ------
Symbol: (null) (base: (nil))
Module: ./redrock 0.0.0.0:4301 [cluster] (base 0x400000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=(nil) -D -b binary -m i386:x86-64 /tmp/dump.bin
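The frame addresses in the backtrace above can be pulled out of the log and handed to a symbolizer. The following is a minimal Python sketch, assuming the report has been saved as text; the helper name frame_addresses is hypothetical and is not part of redrock or Redis.

```python
import re

# Matches backtrace lines such as
#   ./redrock 0.0.0.0:4301 cluster[0x95a9f9]
#   ./redrock 0.0.0.0:4301 [cluster][0x94be49]
# and captures the module (first token) and the trailing hex address.
FRAME_RE = re.compile(r"^(?P<module>\S+).*\[(?P<addr>0x[0-9a-f]+)\]\s*$")

def frame_addresses(log_text, module="./redrock"):
    """Return the hex addresses of frames whose first token equals `module`."""
    addrs = []
    for line in log_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m and m.group("module") == module:
            addrs.append(m.group("addr"))
    return addrs

# A few lines copied from the report above.
sample = """\
EIP:
./redrock 0.0.0.0:4301 [cluster][0x94be49]
Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fb60b586390]
./redrock 0.0.0.0:4301 [cluster][0x94be49]
./redrock 0.0.0.0:4301 cluster[0x95a9f9]
"""
print(frame_addresses(sample))
```

The resulting addresses could then be passed to a tool such as addr2line (for example addr2line -f -e ./redrock 0x94be49). That only resolves to function names if the binary still carries symbols, which may not be the case for a released executable.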
from redrock.
In the cluster, are all three nodes running redrock, or is only one running redrock while the other two run redis?
from redrock.
Could you try rockevict somekey and see whether that command has the same problem? Thanks.
from redrock.
Also, what exact Linux version are you running?
from redrock.
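Also, without enabling cluster mode, could you run rockall or rockevict on a single standalone redrock and see whether there is a problem?
Judging from the log above, redrock executed a machine instruction that the machine does not recognize (crashed by signal: 4), and that instruction is most likely related to the RocksDB library (if you downloaded the executable directly from the web, it statically includes the RocksDB library). If a single non-cluster instance also crashes on rockall or rockevict, my inference is that the RocksDB library embedded in the static binary contains code that this machine (possibly a specific Linux operating system) cannot recognize. In that case, please: 1) tell me the operating system version (so I can try it); 2) build from source and try again (building from source is fairly involved, so please refer to the build instructions at https://zhuanlan.zhihu.com/p/513026400)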
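The signal reading above can be checked mechanically. The following is a minimal Python sketch that decodes the first line of the crash report; the tables cover only a few Linux signal numbers and SIGILL si_code values, and the helper name decode_crash is hypothetical.

```python
import re

# A few Linux x86-64 signal numbers that commonly appear in crash reports.
SIGNAL_NAMES = {4: "SIGILL", 6: "SIGABRT", 7: "SIGBUS", 8: "SIGFPE", 11: "SIGSEGV"}
# Some si_code values that Linux defines for SIGILL.
SIGILL_CODES = {1: "ILL_ILLOPC", 2: "ILL_ILLOPN", 3: "ILL_ILLADR", 4: "ILL_ILLTRP"}

CRASH_RE = re.compile(r"crashed by signal: (\d+), si_code: (\d+)")

def decode_crash(line):
    """Return (signal name, si_code description) for a Redis crash header line."""
    m = CRASH_RE.search(line)
    if m is None:
        return None
    signo, code = int(m.group(1)), int(m.group(2))
    name = SIGNAL_NAMES.get(signo, f"signal {signo}")
    if name == "SIGILL":
        detail = SIGILL_CODES.get(code, f"si_code {code}")
    else:
        detail = f"si_code {code}"
    return name, detail

# First line of the bug report in this thread.
line = "576:M 07 Jun 2022 18:53:26.024 # Redis 6.2.2~3 crashed by signal: 4, si_code: 2"
print(decode_crash(line))
```

For the report in this thread this yields SIGILL with ILL_ILLOPN, an illegal-instruction fault rather than a segmentation fault, which is consistent with the reading that the binary contains an instruction this machine cannot execute.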
from redrock.
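Some additional information:
1. All 3 master nodes in the cluster are redrock, but they are installed with docker on the same machine.
2. Exact Linux version: Linux version 4.4.0-210-generic (buildd@lgw01-amd64-009) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12) ) #242-Ubuntu SMP
3. After using rockevict to write one or a few keys to disk, the values can still be read. After using rockall, reading a value crashes.
4. Without cluster mode, on a single standalone redrock: after writing keys to disk with rockevict, the values can be read. After using rockall, reading a value also crashes, and running save after rockall crashes as well.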
from redrock.
I installed Ubuntu 16 to test. cat /etc/os-release shows the following Linux version:
NAME="Ubuntu"
VERSION="16.04.7 LTS (Xenial Xerus)"
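I then ran redrock, used the rockall and save commands, and read values back with get and hget. I found no problems.
My current suspicion is docker.
Could you test outside the docker environment, starting with a single instance and then moving to a cluster? Thanks.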
from redrock.
Also, it is best to run it with sudo or as root, because it needs full read/write permission (777) on the /opt/redrock directory.
from redrock.
I did not use docker and installed redrock directly on the host. After rockall I can read values, but save still crashed, as follows:
from redrock.
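Very strange. On Ubuntu 16 here, I also tried loading 33K hash keys (each with 5 fields, values being 2 to 2000 random bytes), then ran rockall, then hget on fields of several keys, then save. I repeated this several times and the problem never appeared.
My suggestions:
- Could you reproduce the whole process for me, including the exact steps you performed?
- Try bgsave and see whether it has a problem (it is the same as save but writes to disk in the background; you can check the result with lastsave or by inspecting the dump.rdb file).
Thanks.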
from redrock.
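A new finding: if the data I store in redrock produces a dump.rdb larger than about 18M, hget crashes after rockall. If the data set is smaller, with a dump.rdb of around 10M, hget does not crash after rockall. I do not know why.
My machine has enough memory.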
from redrock.
Also, run rockstat in between and send me the result as well.
from redrock.
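By the way, is your disk local, or is it a cloud disk? I have had problems with cloud disks before, possibly because the cloud disk implementation conflicts with RocksDB. Is it an SSD or an HDD?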
from redrock.
My test here reached a 35M dump.rdb and I found no problems.
from redrock.
Before rockall:
rockstat result:
rockall result:
Then bgsave also crashes.
My disk is local, and it is an HDD.
from redrock.
After replacing it with a redis service, there is no problem.
from redrock.
My test environment uses an SSD (I have never tested on an HDD). I will check whether RocksDB has any special requirements for HDDs.
from redrock.
Also, try disabling RDB backups and using AOF instead, by passing these startup arguments:
./redrock --save 0 --appendonly yes
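Then do not run save or bgsave for RDB persistence (all data is persisted automatically to appendonly.aof). Keep using rockall to store data in RocksDB, then run hget and see whether there is a problem.
When redrock is restarted, the data is still restored automatically as long as the appendonly.aof backup is in the current directory.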
from redrock.
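By the way, check the size of the RocksDB directory from time to time, and whether it contains any .sst files:
du -h -d 0 /opt/redrock
ls /opt/redrock/rocksdbXXX (XXX is your listening port)
(By default the redrock directory is only a few megabytes, and if no data has been stored, the rocksdbXXX/ subdirectory will contain no .sst files.)
I have one more suspicion: the operating system may be configured so that writing to the /opt/redrock directory is not allowed at all. When there is little data, RocksDB keeps it only in memory and no .sst file is written to disk.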
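The directory checks above (size, presence of .sst files, write permission) can also be collected in one pass. The following is a minimal Python sketch, assuming the /opt/redrock layout described in this thread; the helper name rocksdb_dir_report is hypothetical.

```python
import os

def rocksdb_dir_report(path):
    """Summarize a data directory: existence, writability, total bytes, .sst count."""
    total_bytes = 0
    sst_files = 0
    # os.walk yields nothing for a missing path, so the counters stay at zero.
    for root, _dirs, files in os.walk(path):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
            if name.endswith(".sst"):
                sst_files += 1
    return {
        "exists": os.path.isdir(path),
        # Whether the current user may create files in the directory.
        "writable": os.access(path, os.W_OK),
        "total_bytes": total_bytes,
        "sst_files": sst_files,
    }

if __name__ == "__main__":
    # /opt/redrock is the directory named in this thread.
    print(rocksdb_dir_report("/opt/redrock"))
```

If writable is false, or sst_files stays at zero after rockall on a large data set, that would point toward the permission suspicion described above. The check should be run as the same user that runs redrock.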
from redrock.
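One more thing to try is building your own redrock, and a shared library with RocksDB, on that machine.
The redrock I publish online uses a static RocksDB library (the published redrock contains all the machine code of a local RocksDB build), and everything is built on my own machine. My build environment is CentOS 7, because a binary built on Ubuntu will not run on CentOS, whereas the reverse works.
Please refer to: https://zhuanlan.zhihu.com/p/513026400
It covers how to build on Ubuntu. For my day-to-day testing I always build on Ubuntu.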
from redrock.
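I switched to a different machine and used redrock standalone. After rockall, save now works without problems. I do not know what was wrong with the previous machine.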
from redrock.
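My explanation:
The redrock you are using is an executable that I compiled in my own machine environment. Linux tries to stay compatible, but reality is quite complicated, so the hardware or operating system of the previous machine may conflict with the executable. (Your error is an illegal instruction, that is, machine code that the system does not recognize, reported as crashed by signal 4. It is not a logic error such as a null pointer or an out-of-bounds memory access.) The new machine may simply happen to have no conflict. (I try to compile the executable so that it suits as many Linux environments as possible, and I have made many attempts at this.)
The safest approach is still to build from source, especially for C/C++ programs. (Java is much better in this respect, because the Java VM is what is tied to the hardware and operating system.)
Congratulations on solving the problem, and thank you for providing so much information.
When you have a chance, try a source build on the old failing machine and see whether it resolves the problem.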
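The thread never pins down the cause, but one way to probe the hardware side of this explanation is to compare the CPU feature flags of the failing and the working machine. A common reason for an illegal-instruction crash in a prebuilt binary is that it was compiled for CPU extensions the target machine lacks. The following is a minimal Python sketch; the flag list is purely an assumption for illustration (the build flags of the released redrock are not stated here), and the helper name missing_cpu_flags is hypothetical.

```python
def missing_cpu_flags(cpuinfo_text, required):
    """Return the flags in `required` that the first CPU entry does not list."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return sorted(set(required) - set(value.split()))
    # No "flags" line found: report everything as missing.
    return sorted(required)

# ASSUMPTION: an illustrative set of x86-64 extensions, not the actual
# requirements of the released redrock binary.
ASSUMED_FLAGS = ["sse4_2", "avx", "avx2", "bmi2"]

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(missing_cpu_flags(f.read(), ASSUMED_FLAGS))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

Running this on both machines and comparing the output would show whether the old machine lacks an extension that the new one has. A difference is only a hint, and a source build on the old machine remains the more direct test.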
from redrock.
OK, thank you for patiently helping me solve the problem!
from redrock.
This weekend I will release a new version (version 4). rockall currently has a display-output bug (it only affects the client output).
from redrock.
You can try version 4 now.
from redrock.
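Is it the same on the machine you switched to, where it worked before?
The machine that originally crashed should keep crashing, because I suspect an operating system conflict. That has nothing to do with version 4 versus version 3, unless you try a build compiled from source.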
from redrock.