Comments (5)
Judging from the logs, the proxy received no requests at 10:35:58, so a network problem is the suspected cause.
from matrixone.
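The 10:35:58 window is very short: the service was unreachable for less than a second. It may be more useful to look at what happened between 2024-05-09 23:42:10 and 2024-05-09 23:42:12.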
https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22DQF%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-2eeef92-20240509232129%5C%22,%20matrixorigin_io_component%3D%5C%22ProxySet%5C%22%7D%20%7C%3D%20%60new%20connection%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221715269330000%22,%22to%22:%221715269332000%22%7D%7D%7D&schemaVersion=1&orgId=1
@guguducken please comment the conclusion and forward it back to @aressu1985
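Root cause: on the client, net.ipv4.tcp_tw_reuse was set to 2, which reuses TIME_WAIT sockets only for loopback addresses. The short-lived connections are created through 10.222.1.128, which is not a loopback address, so when the number of connections surges, new connections can no longer be created.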
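For context, here is a rough sketch of why a burst of short-lived connections can exhaust the client. Every outgoing connection to the same server address and port needs a distinct local port, and on Linux a closed connection keeps its port in TIME_WAIT for about 60 seconds unless the socket can be reused. The function names are hypothetical, and the port range 32768-60999 is the usual Linux default, assumed here rather than read from the affected client.

```shell
# Describe what a given net.ipv4.tcp_tw_reuse value means on Linux.
tw_reuse_meaning() {
  case "$1" in
    0) echo "reuse disabled" ;;
    1) echo "reuse enabled for all outgoing connections" ;;
    2) echo "reuse enabled for loopback traffic only" ;;
    *) echo "unknown value" ;;
  esac
}

# Rough upper bound on new connections per second to ONE server ip:port
# when TIME_WAIT sockets cannot be reused:
#   (number of local ports) / (seconds a port stays in TIME_WAIT)
# Arguments: lowest port, highest port, TIME_WAIT duration in seconds.
max_conn_rate() {
  echo $(( ($2 - $1 + 1) / $3 ))
}

# Assumed defaults: ports 32768-60999, TIME_WAIT of 60 seconds.
echo "tcp_tw_reuse=2 means: $(tw_reuse_meaning 2)"
echo "sustainable rate: about $(max_conn_rate 32768 60999 60) connections/s"
```

On a live client the real inputs can be read with `sysctl -n net.ipv4.tcp_tw_reuse` and `cat /proc/sys/net/ipv4/ip_local_port_range`. A short-connection benchmark that opens connections faster than this bound would be expected to run out of local ports after a while, which matches the delayed failure described in this thread.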
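Reproduction steps:
- Start sysbench and begin the short-connection test
- Check the reachability of the mo service port with the following command
while true; do mysql -h 10.222.1.132 -udump -P30015 -p111 -e 'select 1;' > /dev/null;sleep 1; done
- After some time, the following error appears
ERROR 2003 (HY000): Can't connect to MySQL server on '10.222.1.132:30015' (99)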
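The trailing number in the ERROR 2003 message is the operating-system errno from the client's connection attempt. On Linux, 99 is EADDRNOTAVAIL ("Cannot assign requested address"), which points at the client side rather than the server. The sketch below pulls that number out of a message and maps a few common Linux values. The function names are hypothetical, and the errno numbers are the Linux ones; other platforms number them differently.

```shell
# Print the OS errno that the mysql client appends as "(N)" at the end
# of a connection error message. Prints nothing if there is no match.
extract_os_errno() {
  printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9]*\))[[:space:]]*$/\1/p'
}

# Map a few Linux errno values to a short hint about where to look.
errno_hint() {
  case "$1" in
    99)  echo "EADDRNOTAVAIL: client could not get a local address/port" ;;
    110) echo "ETIMEDOUT: no answer from the server side" ;;
    111) echo "ECONNREFUSED: nothing listening on the server port" ;;
    113) echo "EHOSTUNREACH: no route to the server" ;;
    *)   echo "errno $1: not covered by this sketch" ;;
  esac
}

msg="ERROR 2003 (HY000): Can't connect to MySQL server on '10.222.1.132:30015' (99)"
errno_hint "$(extract_os_errno "$msg")"
```

Feeding the loop's stderr through these helpers separates client-side exhaustion (99) from server-side or network failures, which narrows down where to investigate.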
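Investigation:
- Connected to the proxy pod directly; no problem was found
- Changed the service type to nodePort; the problem persisted in testing
- Modified kernel parameters on the server-side cluster nodes (132, 134), including
ulimit -n 65535
sysctl net.ipv4.tcp_tw_reuse=1
sysctl net.ipv4.tcp_fin_timeout=5
- Captured packets with tcpdump on nodes 132 and 134 to analyze the network traffic and check whether the CNI was at fault
- Rebooted nodes 132 and 134
- Reproduced the problem again; ss -s showed that when it occurred the server had few TIME_WAIT sockets, while the number of TIME_WAIT sockets on the client surged
- Added a control client that ran the reachability script from the reproduction steps against the mo cluster, while continuously watching the TIME_WAIT count on the client
- Set the client kernel parameter net.ipv4.tcp_tw_reuse=1 and tried to reproduce again
- After the kernel parameter change, the test no longer reported errors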
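The TIME_WAIT observation above can be narrowed to a single destination, which makes the client-side buildup easier to watch than the global totals from ss -s. This is a minimal sketch, assuming the column layout of `ss -tan` output (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port). The function name is hypothetical and the sample rows are made up for illustration, not captured from the affected machines.

```shell
# Count TIME-WAIT sockets whose peer is the given address:port.
# Reads `ss -tan` style output on stdin; column 1 is the state and
# column 5 is the peer address:port.
count_time_wait_to() {
  awk -v peer="$1" '$1 == "TIME-WAIT" && $5 == peer { n++ } END { print n + 0 }'
}

# Made-up sample in the shape of `ss -tan` output.
sample='State     Recv-Q Send-Q Local Address:Port  Peer Address:Port
TIME-WAIT 0      0      10.222.1.128:40112  10.222.1.132:30015
TIME-WAIT 0      0      10.222.1.128:40114  10.222.1.132:30015
ESTAB     0      0      10.222.1.128:22     10.222.1.1:51000'

printf '%s\n' "$sample" | count_time_wait_to 10.222.1.132:30015   # prints 2
```

On a live client the same function can be fed with `ss -tan | count_time_wait_to 10.222.1.132:30015`. A count that climbs toward the size of the local port range while the test runs is consistent with the root cause described in this thread.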
fixed