Comments (8)
What errors or logs do you see when the service is restarted?
@kevincai
Thanks for your help!
Here are the logs from fe.log:
2024-05-10 03:53:35.621Z INFO (main|1) [StarRocksFE.start():130] StarRocks FE starting, version: 3.2.6-2585333
2024-05-10 03:53:35.629Z INFO (main|1) [FrontendOptions.initAddrUseIp():249] Use IP init local addr, IP: /172.26.0.4
2024-05-10 03:53:36.029Z INFO (main|1) [Auth.grantRoleInternal():837] grant operator to 'root'@'%', isReplay = true
2024-05-10 03:53:36.064Z INFO (main|1) [AuthorizationMgr.initBuiltinRoleUnlocked():283] create built-in role root[-1]
2024-05-10 03:53:36.072Z INFO (main|1) [AuthorizationMgr.initBuiltinRoleUnlocked():283] create built-in role db_admin[-2]
2024-05-10 03:53:36.074Z INFO (main|1) [AuthorizationMgr.initBuiltinRoleUnlocked():283] create built-in role cluster_admin[-3]
2024-05-10 03:53:36.075Z INFO (main|1) [AuthorizationMgr.initBuiltinRoleUnlocked():283] create built-in role user_admin[-4]
2024-05-10 03:53:36.076Z INFO (main|1) [AuthorizationMgr.initBuiltinRoleUnlocked():283] create built-in role public[-5]
2024-05-10 03:53:36.077Z INFO (main|1) [GlobalStateMgr.initAuth():1206] using new privilege framework..
2024-05-10 03:53:36.327Z INFO (main|1) [NodeMgr.getHelperNodes():694] get helper nodes: [172.26.0.4:9010]
2024-05-10 03:53:36.337Z INFO (main|1) [NodeMgr.getClusterIdAndRoleOnStartup():495] Current run_mode is shared_data
2024-05-10 03:53:36.337Z INFO (main|1) [NodeMgr.getClusterIdAndRoleOnStartup():502] Got cluster id: 1901908040, role: FOLLOWER, node name: 172.24.0.4_9010_1715312626878 and run_mode: shared_data
2024-05-10 03:53:36.340Z INFO (main|1) [BDBEnvironment.ensureHelperInLocal():336] skip check local environment because helper node and local node are identical.
2024-05-10 03:53:36.358Z INFO (main|1) [BDBEnvironment.setupEnvironment():266] start to setup bdb environment for 1 times
Here are the logs from cn.warning:
W0510 03:53:49.557263 1 cpu_info.cpp:190] /sys/devices/system/node is not present - no NUMA support
E0510 03:56:19.860203 1 daemon.cpp:241] got signal: Terminated from pid: 1117773120, is going to exit
Here are the logs from cn.info:
I0510 03:53:49.692610 1 daemon.cpp:275] Minidump is disabled
I0510 03:53:49.692624 1 starrocks_be.cpp:144] CN start step 1: daemon threads start successfully
I0510 03:53:49.692763 1 starrocks_be.cpp:148] CN start step 2: jdbc driver manager init successfully
I0510 03:53:49.692890 1 backend_options.cpp:77] localhost 172.26.0.5
I0510 03:53:49.692894 1 starrocks_be.cpp:154] CN start step 3: backend network options init successfully
I0510 03:53:49.692955 1 exec_env.cpp:275] Set storage page cache size 1331665920
I0510 03:53:49.693931 456 daemon.cpp:197] Current memory statistics: process(0), query_pool(0), load(0), metadata(0), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0), datacache(0)
I0510 03:53:49.694003 1 starrocks_be.cpp:159] CN start step 4: global env init successfully
I0510 03:53:49.697417 458 data_dir.cpp:135] path: /opt/starrocks/cn/storage, hash: 753312587712824051
I0510 03:53:49.743515 534 data_dir.cpp:268] begin loading tablet from meta /opt/starrocks/cn/storage
I0510 03:53:49.743554 534 data_dir.cpp:322] load tablet from meta finished, loaded tablet: 0, error tablet: 0, path: /opt/starrocks/cn/storage duration: 0ms
I0510 03:53:49.743558 534 data_dir.cpp:351] begin loading rowset from meta /opt/starrocks/cn/storage
I0510 03:53:49.743564 534 data_dir.cpp:445] load rowset from meta finished, data dir: /opt/starrocks/cn/storage error/total: 0/0 duration: 0ms
I0510 03:53:49.744705 1 starrocks_be.cpp:162] CN start step 5: storage engine init successfully
I0510 03:53:49.749397 594 fragment_mgr.cpp:560] FragmentMgr cancel worker start working.
I0510 03:53:49.759133 1 exec_env.cpp:389] [PIPELINE] Exec thread pool: thread_num=12
I0510 03:53:49.800384 820 runtime_filter_worker.cpp:862] RuntimeFilterWorker start working.
I0510 03:53:49.800557 822 profile_report_worker.cpp:113] ProfileReportWorker start working.
I0510 03:53:49.800653 823 result_buffer_mgr.cpp:147] result buffer manager cancel thread begin.
I0510 03:53:49.801756 1 load_path_mgr.cpp:69] Load path configured to [/opt/starrocks/cn/storage/mini_download]
I0510 03:53:49.806778 1 starrocks_be.cpp:166] CN start step 6: exec engine init successfully
I0510 03:53:49.808176 892 compaction_manager.cpp:69] start compaction scheduler
I0510 03:53:49.808197 893 storage_engine.cpp:703] start to check compaction
I0510 03:53:49.808764 900 olap_server.cpp:890] begin to do tablet meta checkpoint:/opt/starrocks/cn/storage
I0510 03:53:49.808974 903 olap_server.cpp:817] try to perform path gc by tablet!
I0510 03:53:49.809051 904 olap_server.cpp:870] try to clear expired replication snapshots!
I0510 03:53:49.809089 1 olap_server.cpp:257] All backgroud threads of storage engine have started.
I0510 03:53:49.809098 1 starrocks_be.cpp:171] CN start step 7: storage engine start bg threads successfully
I0510 03:53:49.811769 1 starlet_server.cc:77] Starlet grpc server started on 0.0.0.0:9070
I0510 03:53:49.811854 1 starrocks_be.cpp:175] CN start step 8: staros worker init successfully
I0510 03:53:49.811939 1 starrocks_be.cpp:182] BE start step 9: datacache init successfully
I0510 03:53:49.811954 1 backend_base.cpp:78] StarRocksInternalService has started listening port on 9060
I0510 03:53:49.811992 914 starlet.cc:103] Empty starmanager address, skip reporting!
I0510 03:53:49.812615 1 thrift_server.cpp:380] BackendService has started listening port on 9060
I0510 03:53:49.812624 1 starrocks_be.cpp:197] CN start step 10: start thrift server successfully
I0510 03:53:49.816478 1 server.cpp:1069] Server[starrocks::LakeServiceImpl+starrocks::BackendInternalServiceImpl<starrocks::PInternalService>+starrocks::BackendInternalServiceImpl<doris::PBackendService>] is serving on port=8060.
I0510 03:53:49.816488 1 server.cpp:1072] Check out http://starrocks-cn:8060 in web browser.
I0510 03:53:49.816622 1 starrocks_be.cpp:231] CN start step 11: start brpc server successfully
I0510 03:53:49.862540 1 starrocks_be.cpp:240] CN start step 12: start http server successfully
I0510 03:53:49.867992 1 thrift_server.cpp:380] heartbeat has started listening port on 9050
I0510 03:53:49.868005 1 starrocks_be.cpp:259] CN start step 13: start heartbeat server successfully
I0510 03:53:49.868006 1 starrocks_be.cpp:261] CN started successfully
I0510 03:54:04.709030 456 daemon.cpp:197] Current memory statistics: process(104747312), query_pool(0), load(0), metadata(0), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0), datacache(0)
I0510 03:54:19.725925 456 daemon.cpp:197] Current memory statistics: process(104747312), query_pool(0), load(0), metadata(0), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0), datacache(0)
I0510 03:54:20.838241 893 storage_engine.cpp:706] 0 tablets checked. time elapse:31 seconds. compaction checker will be scheduled again in 1800 seconds
The only thing I see is
E0510 03:56:19.860203 1 daemon.cpp:241] got signal: Terminated from pid: 1117773120, is going to exit
as an exit signal. I guess this is from the compose restart?
What additional errors prevent the fe/cn services from coming up?
One more thing: did you assign fixed IP addresses to the fe and cn containers? They can easily get into an error state if their IPs change after the containers are restarted.
Accidentally selected the close option.
Yes, what I did was run docker compose down and then docker compose up again.
But I didn't see any additional logs in the log files or from the docker logs command.
Check our example at https://github.com/StarRocks/demo/blob/master/deploy/docker-compose/docker-compose.yml
The key difference is that it enables FQDN mode in the docker compose YAML:
...
  - |
    /opt/starrocks/fe_entrypoint.sh starrocks-fe-0
environment:
  - HOST_TYPE=FQDN
...
Otherwise, you need to deal with the IP change every time the containers are restarted.
FQDN info: https://docs.starrocks.io/docs/administration/management/enable_fqdn/
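The fe.log excerpt posted earlier in this thread already hints at the IP change: the FE reports its local address as 172.26.0.4, while the stored node name starts with 172.24.0.4. A minimal Python sketch for spotting that mismatch is below. It assumes the node name has the form host_port_timestamp with the originally registered host first (inferred from the log line, not confirmed here), and the helper name find_ip_mismatch is hypothetical.

```python
import re

# Patterns taken from the two fe.log lines quoted in this thread.
LOCAL_IP_RE = re.compile(r"Use IP init local addr, IP: /(\d+\.\d+\.\d+\.\d+)")
# Assumption: the node name is "<host>_<port>_<timestamp>" and <host> is the
# address the FE registered with when the metadata was first created.
NODE_NAME_RE = re.compile(r"node name: (\d+\.\d+\.\d+\.\d+)_\d+_\d+")


def find_ip_mismatch(log_text):
    """Return (local_ip, registered_ip) if they differ, otherwise None."""
    local = LOCAL_IP_RE.search(log_text)
    node = NODE_NAME_RE.search(log_text)
    if local is None or node is None:
        # One of the two lines is missing, so nothing can be compared.
        return None
    local_ip, registered_ip = local.group(1), node.group(1)
    if local_ip != registered_ip:
        return (local_ip, registered_ip)
    return None


SAMPLE = (
    "2024-05-10 03:53:35.629Z INFO (main|1) [FrontendOptions.initAddrUseIp():249] "
    "Use IP init local addr, IP: /172.26.0.4\n"
    "2024-05-10 03:53:36.337Z INFO (main|1) [NodeMgr.getClusterIdAndRoleOnStartup():502] "
    "Got cluster id: 1901908040, role: FOLLOWER, node name: "
    "172.24.0.4_9010_1715312626878 and run_mode: shared_data\n"
)

if __name__ == "__main__":
    print(find_ip_mismatch(SAMPLE))  # ('172.26.0.4', '172.24.0.4')
```

If this prints a pair of addresses, the container most likely came back with a different IP than the one recorded in the FE metadata, which is the situation FQDN mode or fixed IPs are meant to avoid.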
@kevincai Many thanks! After enabling FQDN and adding fixed IP addresses, StarRocks works well with the docker compose up/down commands. I guess the problem was related to the IP changing when I restarted the services with docker compose.
Thanks for your help again!
Closing the issue as completed.