Comments (29)

slrem avatar slrem commented on May 28, 2024

Here is the error log:
=== REDIS BUG REPORT START: Cut & paste starting from here ===
576:M 07 Jun 2022 18:53:26.024 # Redis 6.2.2~3 crashed by signal: 4, si_code: 2
576:M 07 Jun 2022 18:53:26.025 # Crashed running the instruction at: 0x94be49

------ STACK TRACE ------
EIP:
./redrock 0.0.0.0:4301 [cluster][0x94be49]

Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fb60b586390]
./redrock 0.0.0.0:4301 [cluster][0x94be49]
./redrock 0.0.0.0:4301 cluster[0x95a9f9]
./redrock 0.0.0.0:4301 cluster[0x960938]
./redrock 0.0.0.0:4301 cluster[0x960b55]
./redrock 0.0.0.0:4301 cluster[0x92c2d3]
./redrock 0.0.0.0:4301 cluster[0x932d90]
./redrock 0.0.0.0:4301 cluster[0x80d2a3]
./redrock 0.0.0.0:4301 cluster[0x82ffa0]
./redrock 0.0.0.0:4301 cluster[0x71e4d3]
./redrock 0.0.0.0:4301 cluster[0x703f47]
./redrock 0.0.0.0:4301 cluster[0x6f9f60]
./redrock 0.0.0.0:4301 cluster[0x6eeb3d]
./redrock 0.0.0.0:4301 [cluster][0x6488b0]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fb60b57c6ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb60ad1a51d]

------ REGISTERS ------
576:M 07 Jun 2022 18:53:26.029 #
RAX:00000000642e9b12 RBX:00007fb6040296a0
RCX:00007fb6040472d0 RDX:0000000000000140
RDI:00007fb5ebffcf19 RSI:0000000000000006
RBP:00007fb5ebffbf30 RSP:00007fb5ebffbf00
R8 :00007fb5ebffc300 R9 :00007fb5ebffbf60
R10:00007fb5ebffbf50 R11:00007fb60ada8090
R12:0000000000000000 R13:0000000000000019
R14:00007fb5ebffc870 R15:000000000095ab00
RIP:000000000094be49 EFL:0000000000010202
CSGSFS:0000000000000033
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0f) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0e) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0d) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0c) -> 00007fb6040478d0
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0b) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf0a) -> 0000000000000000
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf09) -> 000000000095a9f9
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf08) -> 00007fb5ebffbf90
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf07) -> 000000000095a9f9
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf06) -> 00007fb5ebffbf90
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf05) -> 00007fb5ebffbf50
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf04) -> 00007fb5ebffc150
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf03) -> 642e9b12ebffc150
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf02) -> 0000000000000006
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf01) -> 00007fb604046d02
576:M 07 Jun 2022 18:53:26.029 # (00007fb5ebffbf00) -> 0000000000000000

------ INFO OUTPUT ------

Server

redis_version:6.2.2~3
redis_git_sha1:e0d05944
redis_git_dirty:1
redis_build_id:7decd1451888d8bb
redis_mode:cluster
os:Linux 4.4.0-210-generic x86_64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:c11-builtin
gcc_version:7.3.1
process_id:576
process_supervised:no
run_id:86ca2c59a0b55e54d528c6b622dc44482890eb2f
tcp_port:4301
server_time_usec:1654599206024399
uptime_in_seconds:292
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:10432037
executable:/ulwork/redis4301/./redrock
config_file:/ulwork/redis4301/redis.conf
io_threads_active:0

Clients

connected_clients:1
cluster_connections:4
maxclients:10000
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0

Memory

used_memory:5585672
used_memory_human:5.33M
used_memory_rss:94072832
used_memory_rss_human:89.71M
used_memory_peak:185100944
used_memory_peak_human:176.53M
used_memory_peak_perc:3.02%
used_memory_overhead:3354784
used_memory_startup:1495984
used_memory_dataset:2230888
used_memory_dataset_perc:54.55%
allocator_allocated:5791640
allocator_active:10039296
allocator_resident:15921152
total_system_memory:8361390080
total_system_memory_human:7.79G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.73
allocator_frag_bytes:4247656
allocator_rss_ratio:1.59
allocator_rss_bytes:5881856
rss_overhead_ratio:5.91
rss_overhead_bytes:78151680
mem_fragmentation_ratio:17.04
mem_fragmentation_bytes:88551104
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

Persistence

loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1654598914
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

Stats

total_connections_received:4
total_commands_processed:11
instantaneous_ops_per_sec:0
total_net_input_bytes:280
total_net_output_bytes:1256125
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:2
evicted_keys:0
keyspace_hits:1
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:15
total_writes_processed:25
io_threaded_reads_processed:0
io_threaded_writes_processed:0

Replication

role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:00ff4cb3adeeb04d8c234a9551b6dc8e387e6488
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

CPU

used_cpu_sys:0.816000
used_cpu_user:2.784000
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:0.000000
used_cpu_user_main_thread:0.004000

Modules

Commandstats

cmdstat_keys:calls=3,usec=36955,usec_per_call=12318.33,rejected_calls=0,failed_calls=0
cmdstat_auth:calls=4,usec=32,usec_per_call=8.00,rejected_calls=0,failed_calls=0
cmdstat_command:calls=3,usec=3067,usec_per_call=1022.33,rejected_calls=0,failed_calls=0
cmdstat_rockall:calls=1,usec=1452897,usec_per_call=1452897.00,rejected_calls=0,failed_calls=0

Errorstats

Cluster

cluster_enabled:1

Keyspace

db0:keys=33362,expires=0,avg_ttl=0

------ CLIENT LIST OUTPUT ------
id=6 addr=172.17.0.1:34536 laddr=172.17.0.10:4301 fd=23 name= age=1 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=9 obl=0 oll=0 omem=0 tot-mem=61473 events=r cmd=hget user=default redir=-1

------ MODULES INFO OUTPUT ------

------ FAST MEMORY TEST ------
576:M 07 Jun 2022 18:53:26.030 # main thread terminated
576:M 07 Jun 2022 18:53:26.030 # Bio thread for job type #0 terminated
576:M 07 Jun 2022 18:53:26.030 # Bio thread for job type #1 terminated
576:M 07 Jun 2022 18:53:26.030 # Bio thread for job type #2 terminated

Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

------ DUMPING CODE AROUND EIP ------
Symbol: (null) (base: (nil))
Module: ./redrock 0.0.0.0:4301 [cluster] (base 0x400000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=(nil) -D -b binary -m i386:x86-64 /tmp/dump.bin

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

In the cluster, are all three nodes redrock, or is only one of them redrock and the other two redis?

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

Could you try rockevict somekey and see whether that command also fails? Thanks.

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

Also, what exact Linux version are you running?

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

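Also, with cluster mode disabled, run a single standalone redrock and execute rockall or rockevict. Does the problem still occur?

Judging from the log above, redrock executed a machine instruction that the machine does not recognize (crashed by signal: 4), and that instruction is most likely related to the RocksDB library (if you downloaded the executable directly from the web, it statically includes the RocksDB library). If a single non-cluster instance also crashes on rockall or rockevict, my inference is that the statically linked RocksDB library contains code this machine cannot execute (possibly code built for one specific Linux operating system). In that case, please: 1) tell me your operating system version (I can try it myself); 2) build from source and try again (building from source is quite involved; see the build instructions at https://zhuanlan.zhihu.com/p/513026400)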
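
For reference, the crash header in the log can be decoded mechanically. The sketch below assumes Linux numbering, where signal 4 is SIGILL and, for SIGILL, si_code 2 is ILL_ILLOPN (illegal operand). The helper name decode_crash_line and the regular expression are made up for this illustration and are not part of redrock or Redis.

```python
import re
import signal

# si_code values for SIGILL as defined in the Linux siginfo headers.
ILL_CODES = {
    1: "ILL_ILLOPC",  # illegal opcode
    2: "ILL_ILLOPN",  # illegal operand
    3: "ILL_ILLADR",  # illegal addressing mode
    4: "ILL_ILLTRP",  # illegal trap
    5: "ILL_PRVOPC",  # privileged opcode
    6: "ILL_PRVREG",  # privileged register
    7: "ILL_COPROC",  # coprocessor error
    8: "ILL_BADSTK",  # internal stack error
}

# Hypothetical pattern matching the "crashed by signal" line of the report.
CRASH_RE = re.compile(r"crashed by signal: (\d+), si_code: (\d+)")


def decode_crash_line(line):
    """Return (signal name, si_code name) for a crash line, or None."""
    m = CRASH_RE.search(line)
    if m is None:
        return None
    signo, code = int(m.group(1)), int(m.group(2))
    name = signal.Signals(signo).name
    # Only SIGILL codes are translated here; other codes stay numeric.
    code_name = ILL_CODES.get(code, str(code)) if name == "SIGILL" else str(code)
    return name, code_name


line = "576:M 07 Jun 2022 18:53:26.024 # Redis 6.2.2~3 crashed by signal: 4, si_code: 2"
print(decode_crash_line(line))
```

An illegal-instruction signal points at the binary and the CPU rather than at the data, which is why the questions above focus on the operating system and on how the executable was built.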

from redrock.

slrem avatar slrem commented on May 28, 2024

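Some additional information:
1. All 3 master nodes in the cluster are redrock, but they are installed with docker on the same machine.

2. Exact Linux version: Linux version 4.4.0-210-generic (buildd@lgw01-amd64-009) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12) ) #242-Ubuntu SMP

3. After using rockevict to move one or a few keys to disk, I can still read their values. After using rockall, reading a value crashes.

4. With cluster mode disabled and a single standalone redrock, values can still be read after rockevict. After rockall, reading a value also crashes, and running save after rockall crashes too.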

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

I installed Ubuntu 16 to test. cat /etc/os-release shows the following Linux version:
NAME="Ubuntu"
VERSION="16.04.7 LTS (Xenial Xerus)"

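Then I ran redrock, used the rockall and save commands, and read values back with get and hget. I found no problems.

My current suspicion is docker.

Could you test outside docker, starting with a single instance and then moving on to the cluster? Thanks.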

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

Also, it is best to run it with sudo or as root, because it needs full read/write permission (777) on the /opt/redrock directory.

from redrock.

slrem avatar slrem commented on May 28, 2024

Without docker, I installed redrock directly on the host. After rockall I can read values, but save still crashes, as shown below:
image

from redrock.

slrem avatar slrem commented on May 28, 2024

The error log is as follows:
image

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

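Very strange. On Ubuntu 16 here, I also loaded 33K hash keys (each with 5 fields and random values of 2 to 2000 bytes), ran rockall, ran hget on fields of several keys, and then ran save. I repeated this several times and never hit the problem.

My suggestions:

  1. Could you share the whole reproduction with me, including the exact steps?
  2. Try bgsave and see whether it fails (it does the same as save, but saves in the background; you can check the result with lastsave or by looking at the dump.rdb file).

Thanks.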
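
To make such a reproduction easy to share, the test data described above (33K hash keys, 5 fields each, random values of 2 to 2000 bytes) can be generated as a stream of HSET commands in Redis protocol form. This is only a sketch: the key names key:N, the field names fN, and the helper names are invented for illustration and are not the data set actually used in the tests above.

```python
import random


def resp_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def make_hset(key_index, fields=5, min_len=2, max_len=2000, rng=random):
    """Build one HSET with `fields` fields and random lowercase values."""
    args = ["HSET", "key:%d" % key_index]  # hypothetical key naming
    for f in range(fields):
        size = rng.randint(min_len, max_len)
        value = bytes(rng.choices(b"abcdefghijklmnopqrstuvwxyz", k=size))
        args += ["f%d" % f, value]
    return resp_command(*args)


def write_commands(out, count=33000, rng=random):
    """Write `count` HSET commands to a binary file-like object."""
    for i in range(count):
        out.write(make_hset(i, rng=rng))
```

Calling write_commands(sys.stdout.buffer) from a small script and piping the result into redis-cli -p PORT --pipe should load the keys; after that, rockall, hget and save can be run by hand in the same order as in the crash report.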

from redrock.

slrem avatar slrem commented on May 28, 2024

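New finding: when the data I store in redrock makes dump.rdb larger than roughly 18M, hget crashes after rockall. With less data (dump.rdb around 10M), hget does not crash after rockall. I don't know why.
My machine has enough memory.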
image

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

Also, run rockstat partway through and send me the output.

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

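By the way, is your disk local, or is it a cloud disk? (I have seen problems with cloud disks before, possibly because their implementation conflicts with RocksDB.) Is it SSD or HDD?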

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

In my test here dump.rdb reached 35M and I found no problems.

from redrock.

slrem avatar slrem commented on May 28, 2024

Before rockall:
rockstat output:
image
rockall output:
image
After that, bgsave also crashes.
My disk is local, and it is an HDD.

from redrock.

slrem avatar slrem commented on May 28, 2024

After switching to the Redis service there is no problem.

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

My test environment uses SSD (I have never tested on HDD). I will check whether RocksDB has any special requirements for HDD.

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

Also, try disabling RDB persistence and using AOF persistence instead, by passing these startup arguments:
./redrock --save 0 --appendonly yes

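Then do not run save or bgsave for RDB snapshots (all persisted data goes to appendonly.aof automatically). Keep using rockall to move data to RocksDB, then run hget and see whether the problem occurs.

When redrock restarts, the data is still restored automatically as long as appendonly.aof exists in the current directory.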

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

By the way, check from time to time how large the RocksDB directory is and whether it contains any .sst files:

du -h -d 0 /opt/redrock

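ls /opt/redrock/rocksdbXXX (XXX is your listening port)

(By default the redrock directory is only a few megabytes, and if no data has been stored, the rocksdbXXX/ subdirectory contains no .sst files.)

I have one more suspicion: the operating system may be configured so that writing to the /opt/redrock directory is not allowed at all. When there is little data, RocksDB keeps it in memory only, and no .sst file is written to disk.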
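
The same checks (directory size, presence of .sst files, write permission) can be collected in one step. A minimal sketch, assuming the layout described above; the function name rocksdb_dir_report and the returned keys are chosen for this example only.

```python
import os
from pathlib import Path


def rocksdb_dir_report(root):
    """Summarize a RocksDB data directory: size, .sst files, writability."""
    root = Path(root)
    if not root.is_dir():
        return {"exists": False, "writable": False,
                "total_bytes": 0, "sst_files": []}
    files = [p for p in root.rglob("*") if p.is_file()]
    return {
        "exists": True,
        # Whether the current user may create files directly under root.
        "writable": os.access(root, os.W_OK),
        "total_bytes": sum(p.stat().st_size for p in files),
        "sst_files": sorted(str(p.relative_to(root))
                            for p in files if p.suffix == ".sst"),
    }
```

Running rocksdb_dir_report("/opt/redrock") before and after rockall shows whether .sst files appear and whether the directory grows. An empty sst_files list after a large rockall would support the write-permission suspicion.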

from redrock.

slrem avatar slrem commented on May 28, 2024

Yes, they are there.
image

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

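One more thing to try: build your own redrock on that machine, with RocksDB as a shared library.

The redrock I publish online uses a static RocksDB library (the published redrock contains all the machine code from my local RocksDB build), and everything was built on my own machine. My build environment is CentOS 7, because a binary built on Ubuntu cannot run on CentOS, while the reverse works.

See: https://zhuanlan.zhihu.com/p/513026400

It covers how to build on Ubuntu. For my day-to-day testing I always build on Ubuntu.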

from redrock.

slrem avatar slrem commented on May 28, 2024

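I switched to another machine and ran a single redrock instance there. After rockall, save now works without problems. I don't know what was wrong with the previous machine.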

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

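My explanation:

The redrock you used is an executable I compiled in my own environment. Linux tries to stay compatible, but in practice things are quite complicated, so the hardware or operating system of the previous machine may conflict with the executable. Your error is an illegal instruction, that is, machine code the system does not recognize (crashed by signal 4), not a logic error such as a null pointer or an out-of-bounds memory access. The new machine happens to have no such conflict. (I try to compile the executable so that it suits as many Linux environments as possible, and I have made many attempts at this.)

The safest approach is still to build from source, especially for C/C++ programs. (Java is much better in this respect, because it is the Java VM that is tied to the hardware and operating system.)

Congratulations on solving the problem, and thank you for providing so much information.

When you get a chance, try building from source on the old failing machine and see whether that fixes it.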
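
One way to check the hardware side of this explanation is to compare the CPU feature flags of the working machine and the crashing one. The sketch below assumes the Linux x86 /proc/cpuinfo format with a "flags" line. The function names are invented here, and whether the published binary really depends on any flag that turns out to be missing is a guess that this thread does not confirm.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(working_cpuinfo, crashing_cpuinfo):
    """Flags present on the working machine but absent on the crashing one."""
    return sorted(cpu_flags(working_cpuinfo) - cpu_flags(crashing_cpuinfo))


# Made-up sample input, not taken from either machine in this thread.
working = "processor\t: 0\nflags\t\t: fpu sse4_2 avx avx2\n"
crashing = "processor\t: 0\nflags\t\t: fpu sse4_2\n"
print(missing_flags(working, crashing))
```

In practice the two inputs would be the contents of /proc/cpuinfo saved from each machine. A non-empty result only narrows down candidates; building from source on the failing machine remains the direct test.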

from redrock.

slrem avatar slrem commented on May 28, 2024

OK, thank you for patiently helping me solve the problem!

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

I will release a new version (version 4) this weekend. rockall has a bug in its displayed output (it only affects what the client sees).

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

You can try version 4 now.

from redrock.

slrem avatar slrem commented on May 28, 2024

I downloaded the new version again, and the result is the same.
image

from redrock.

szstonelee avatar szstonelee commented on May 28, 2024

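Is that also the case on the other machine, the one that worked after you switched?

The original crashing machine should keep crashing, because I suspect an operating system conflict. That has nothing to do with whether it is version 3 or version 4, unless you try a build compiled from source.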

from redrock.
