Comments (17)
Use the change stream capability to migrate; then the balancer does not need to be turned off.
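For example, a minimal change-stream tail opened through mongos with the Go driver might look like the sketch below (the URI, database and collection names are placeholders, and this is not MongoShake's own implementation). Because mongos merges and orders the stream across all shards, chunk migrations performed by the balancer do not surface as change events:

package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()

	// Connect through mongos so the change stream is merged and ordered
	// across all shards by the cluster itself (URI is a placeholder).
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://mongos-host:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	coll := client.Database("testdb").Collection("testcoll")

	// Watch all changes on the collection; fullDocument=updateLookup returns
	// the whole document for update events.
	cs, err := coll.Watch(ctx, mongo.Pipeline{},
		options.ChangeStream().SetFullDocument(options.UpdateLookup))
	if err != nil {
		log.Fatal(err)
	}
	defer cs.Close(ctx)

	for cs.Next(ctx) {
		var event bson.M
		if err := cs.Decode(&event); err != nil {
			log.Fatal(err)
		}
		// operationType is insert/update/delete/...; balancer chunk moves
		// do not appear here, which is why the balancer can stay on.
		fmt.Println(event["operationType"], event["documentKey"])
	}
	if err := cs.Err(); err != nil {
		log.Fatal(err)
	}
}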
from mongoshake.
- If the balancer is not turned off, data keeps moving between different shards during the sync.
- Yes, that works.
- Not much; by default it pulls from a secondary.
from mongoshake.
1. What effect does chunk migration have on the sync? The source cluster migrates chunks and the target cluster does too — can that produce incorrect data?
2. Does it automatically pull from a secondary? If a primary/secondary failover happens, will it switch to pulling from a different secondary?
from mongoshake.
- It is not a problem with the target database but with the source. When the shards are being read while data migrates between them, the problem cannot be solved at the MongoShake level.
- Yes, it will.
from mongoshake.
1. When a migration happens, can't MongoShake simply replay the oplog produced by the migration on the target — delete what migrated away and insert what migrated in? I don't see what would go wrong; could you give an example?
from mongoshake.
Looking at the logs, it seems the sync connects to the primary!
172.17.160.241:27001 is the primary. I configured all nodes of the replica set, but the log below contains the line "[2018/09/30 16:44:01 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.242:27001 successfully" — it looks like the primary node was chosen for the sync?
Here is the log output:
[2018/09/30 16:43:58 CST] [INFO] [quorum.masterChanged:42] become the master and notify waiter
[2018/09/30 16:43:58 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.241:27001,172.17.160.241:27002,172.17.160.241:27003 successfully
[2018/09/30 16:43:58 CST] [INFO] [dbpool.(*MongoConn).Close:46] Close session with mongodb://172.17.160.241:27001,172.17.160.241:27002,172.17.160.241:27003
[2018/09/30 16:43:58 CST] [INFO] [collector.(*ReplicationCoordinator).Run:42] Collector startup. shard_by[collection] gids[]
[2018/09/30 16:43:58 CST] [INFO] [collector.(*ReplicationCoordinator).Run:46] Collector configuration {"MongoUrls":["mongodb://172.17.160.241:27001,172.17.160.241:27002,172.17.160.241:27003"],"CollectorId":"mongoshake2","CheckpointInterval":5000,"HTTPListenPort":9100,"SystemProfile":9200,"LogLevel":"info","LogFileName":"collector.log","LogBuffer":false,"OplogGIDS":"","ShardKey":"collection","SyncerReaderBufferTime":3,"WorkerNum":3,"WorkerOplogCompressor":"none","WorkerBatchQueueSize":64,"Tunnel":"direct","TunnelAddress":["mongodb://172.17.160.242:27001"],"MasterQuorum":true,"ContextStorage":"database","ContextStorageUrl":"mongodb://172.17.160.241:27001,172.17.160.241:27002,172.17.160.241:27003","ContextAddress":"ckpt_default","ContextStartPosition":946684801,"FilterNamespaceBlack":[],"FilterNamespaceWhite":[],"ReplayerDMLOnly":true,"ReplayerExecutor":3,"ReplayerExecutorUpsert":false,"ReplayerExecutorInsertOnDupUpdate":false,"ReplayerCollisionEnable":true,"ReplayerConflictWriteTo":"none","ReplayerDurable":true}
[2018/09/30 16:43:58 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.242:27001 successfully
[2018/09/30 16:43:58 CST] [INFO] [collector.(*Worker).startWorker:110] Collector-worker-0 start working with jobs batch queue. buffer capacity 64
[2018/09/30 16:43:58 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.242:27001 successfully
[2018/09/30 16:43:58 CST] [INFO] [collector.(*Worker).startWorker:110] Collector-worker-1 start working with jobs batch queue. buffer capacity 64
[2018/09/30 16:43:58 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.242:27001 successfully
[2018/09/30 16:43:58 CST] [INFO] [collector.(*OplogSyncer).start:129] Poll oplog syncer start. ckpt_interval[5000ms], gid[], shard_key[collection]
[2018/09/30 16:43:58 CST] [INFO] [collector.(*OplogSyncer).newCheckpointManager:19] Oplog sync create checkpoint manager with [database] [ckpt_default]
[2018/09/30 16:43:58 CST] [INFO] [collector.(*Worker).startWorker:110] Collector-worker-2 start working with jobs batch queue. buffer capacity 64
[2018/09/30 16:43:58 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.241:27001,172.17.160.241:27002,172.17.160.241:27003 successfully
[2018/09/30 16:43:58 CST] [INFO] [ckpt.(*MongoCheckpoint).Get:144] Load exist checkpoint. content &{setone 6606930466106769409}
[2018/09/30 16:43:58 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.241:27001,172.17.160.241:27002,172.17.160.241:27003 successfully
[2018/09/30 16:44:01 CST] [INFO] [executor.(*BarrierMatrix).split:377] Barrier matrix split vector to length 1
[2018/09/30 16:44:01 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.242:27001 successfully
[2018/09/30 16:44:01 CST] [INFO] [dbpool.NewMongoConn:41] New session to mongodb://172.17.160.242:27001 successfully
[2018/09/30 16:44:01 CST] [INFO] [executor.(*Executor).doSync:219] Replayer-2 Executor-8 doSync oplogRecords received[1] merged[1]. merge to 100.00% chunks
[2018/09/30 16:44:01 CST] [INFO] [executor.(*Executor).doSync:219] Replayer-2 Executor-7 doSync oplogRecords received[2] merged[1]. merge to 50.00% chunks
[2018/09/30 16:44:01 CST] [INFO] [collector.(*Worker).transfer:172] Collector-worker-2 transfer retransmit:false send [3] logs. reply_acked [6606934816908640257], list_unack [0]
[2018/09/30 16:44:03 CST] [INFO] [quorum.masterChanged:42] become the master and notify waiter
[2018/09/30 16:44:03 CST] [INFO] [common.(*ReplicationMetric).startup.func1:137] [name=setone, filter=191, get=195, consume=3, apply=3, failed_times=0, success=3, tps=0, ckpt_times=0, retransimit_times=0, tunnel_traffic=344B, lsn_ckpt={0,1970-01-01 08:00:00}, lsn_ack={1538296886,2018-09-30 16:41:26}]
[2018/09/30 16:44:08 CST] [INFO] [common.(*ReplicationMetric).startup.func1:137] [name=setone, filter=192, get=196, consume=3, apply=3, failed_times=0, success=3, tps=0, ckpt_times=0, retransimit_times=0, tunnel_traffic=344B, lsn_ckpt={0,1970-01-01 08:00:00}, lsn_ack={1538296886,2018-09-30 16:41:26}]
[2018/09/30 16:44:13 CST] [INFO] [common.(*ReplicationMetric).startup.func1:137] [name=setone, filter=193, get=197, consume=3, apply=3, failed_times=0, success=3, tps=0, ckpt_times=0, retransimit_times=0, tunnel_traffic=344B, lsn_ckpt={0,1970-01-01 08:00:00}, lsn_ack={1538296886,2018-09-30 16:41:26}]
from mongoshake.
- When data is moved between shards, deletes appear in the old shard's oplog and inserts appear in the new shard's oplog. The shards' oplogs are fetched in parallel and there is no global time ordering, so the relative order of the insert and the delete cannot be guaranteed and the data can end up wrong. For example, if a document is migrated from shard A to shard B, replaying B's insert first and A's delete afterwards would remove the document from the target altogether.
- The oplog is pulled from a secondary; writing the checkpoint connects to the primary.
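For context on the first point: chunk-migration traffic shows up in each shard's local.oplog.rs as ordinary insert/delete entries flagged with fromMigrate, and readers that tail the oplog per shard typically skip those entries. A rough sketch of such a tail with the Go driver (the shard URI is a placeholder; this is not MongoShake's actual reader code):

package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
	"go.mongodb.org/mongo-driver/mongo/readpref"
)

func main() {
	ctx := context.Background()

	// Connect directly to one shard's replica set and prefer a secondary,
	// matching the "pull the oplog from a secondary" behaviour described above.
	opts := options.Client().
		ApplyURI("mongodb://shard1-host:27001"). // placeholder shard address
		SetReadPreference(readpref.SecondaryPreferred())
	client, err := mongo.Connect(ctx, opts)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	oplog := client.Database("local").Collection("oplog.rs")

	// Tailable cursor over the oplog, skipping entries produced by chunk
	// migration (they carry fromMigrate: true).
	filter := bson.D{{Key: "fromMigrate", Value: bson.D{{Key: "$exists", Value: false}}}}
	cur, err := oplog.Find(ctx, filter, options.Find().SetCursorType(options.TailableAwait))
	if err != nil {
		log.Fatal(err)
	}
	defer cur.Close(ctx)

	for cur.Next(ctx) {
		var entry bson.M
		if err := cur.Decode(&entry); err != nil {
			log.Fatal(err)
		}
		fmt.Println(entry["op"], entry["ns"], entry["ts"])
	}
	if err := cur.Err(); err != nil {
		log.Fatal(err)
	}
}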
from mongoshake.
1. With the balancer turned off, automatic data balancing is lost. Are there any best practices for making up for the lack of automatic balancing?
2. About the ordering problem caused by balancing, I have an idea: wouldn't syncing end to end, directly from each source mongod to the corresponding target mongod without going through mongos, solve it?
from mongoshake.
About the balancing ordering problem, I have another idea: special-case the oplog entries flagged with "fromMigrate" and guarantee that the delete is applied before the insert — wouldn't that be enough?
from mongoshake.
- You can shard by hashed key; with range sharding a single shard can grow very large (see the sketch after this list).
- Bypassing mongos is not advisable for a sharded deployment: the target side also has the balancing problem, and you would need to know the shard key. Moreover, data inserted directly into a mongod is invisible to mongos, so mongos cannot know the data exists.
- That ordering cannot be guaranteed; the guarantee you describe is hard to implement at the code level.
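A minimal sketch of setting up hashed sharding and stopping the balancer through the standard admin commands against mongos (the URI and the database/collection names are placeholders):

package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()

	// These admin commands must be sent to mongos, not to an individual shard.
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://mongos-host:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)
	admin := client.Database("admin")

	// Enable sharding on the database and shard the collection by hashed _id,
	// so data spreads evenly even while the balancer is disabled.
	if err := admin.RunCommand(ctx, bson.D{{Key: "enableSharding", Value: "testdb"}}).Err(); err != nil {
		log.Fatal(err)
	}
	cmd := bson.D{
		{Key: "shardCollection", Value: "testdb.testcoll"},
		{Key: "key", Value: bson.D{{Key: "_id", Value: "hashed"}}},
	}
	if err := admin.RunCommand(ctx, cmd).Err(); err != nil {
		log.Fatal(err)
	}

	// Stop the balancer for the duration of the migration (equivalent to
	// sh.stopBalancer() in the mongo shell).
	if err := admin.RunCommand(ctx, bson.D{{Key: "balancerStop", Value: 1}}).Err(); err != nil {
		log.Fatal(err)
	}
}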
from mongoshake.
2. If the target's shards and configuration are set up exactly like the source's, and the target's balancer is turned off, then each target shard will simply follow the changes of the corresponding source shard, right? Although that would also require syncing the config servers.
Also, about your point that "data inserted directly into a mongod is invisible to mongos, so mongos cannot know the data exists": I have tested this and it does work — without going through mongos, as long as the data is inserted into the mongod that matches the routing rules, it can still be queried.
from mongoshake.
The approach you describe can work. You mean the balancer is turned off on both the source and the target, right? That's fine, but if both are off it is still easier to write through mongos — then the config servers do not need to be synced at all.
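If both balancers are disabled as described, one way to verify it is the balancerStatus admin command against each cluster's mongos; a small sketch (both mongos URIs are placeholders):

package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// balancerMode reports the balancer mode ("full" or "off") of the cluster
// behind the given mongos URI.
func balancerMode(ctx context.Context, uri string) (string, error) {
	client, err := mongo.Connect(ctx, options.Client().ApplyURI(uri))
	if err != nil {
		return "", err
	}
	defer client.Disconnect(ctx)

	var status bson.M
	err = client.Database("admin").RunCommand(ctx, bson.D{{Key: "balancerStatus", Value: 1}}).Decode(&status)
	if err != nil {
		return "", err
	}
	mode, _ := status["mode"].(string)
	return mode, nil
}

func main() {
	ctx := context.Background()
	// Placeholder mongos addresses for the source and target clusters.
	for _, uri := range []string{"mongodb://source-mongos:27017", "mongodb://target-mongos:27017"} {
		mode, err := balancerMode(ctx, uri)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s balancer mode: %s\n", uri, mode)
	}
}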
from mongoshake.
I'm trying to find a way to keep using the balancer, so my idea is: keep the balancer on at the source, turn it off at the target, and sync mongod to mongod without going through mongos, so that the target follows the source's balancing through the oplog. But then the config servers would have to be synced as well...
from mongoshake.
Feel free to think this over. The sharded-cluster sync problem is very hard to solve at the MongoShake layer; even a solution would come with all kinds of constraints.
For the open-source version, hashed sharding is the best option: as long as the hash distribution is even, turning off the balancer has little impact.
Later on we plan to modify the database kernel directly, changing the oplog format so that MongoShake can sync without turning off the balancer. Those kernel changes may also be open-sourced.
from mongoshake.
Thanks for the advice.
from mongoshake.
"Feel free to think this over. The sharded-cluster sync problem is very hard to solve at the MongoShake layer; even a solution would come with all kinds of constraints. For the open-source version, hashed sharding is the best option: as long as the hash distribution is even, turning off the balancer has little impact. Later on we plan to modify the database kernel directly, changing the oplog format so that MongoShake can sync without turning off the balancer. Those kernel changes may also be open-sourced."
Is there now a solution for using this tool with the balancer enabled?
from mongoshake.
Use the change stream capability to migrate; then the balancer does not need to be turned off.
That will burn you badly — change streams are not reliable.
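For context on the reliability concern: a change-stream-based sync normally has to persist the resume token and restart from it after any interruption, and resuming fails once the token has aged out of the oplog window. A minimal sketch of that pattern (the checkpoint collection name and the URI are assumptions, not anything MongoShake provides):

package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://mongos-host:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	coll := client.Database("testdb").Collection("testcoll")
	ckpt := client.Database("testdb").Collection("sync_checkpoint") // assumed checkpoint collection

	// Resume from the last persisted token if there is one.
	opts := options.ChangeStream()
	var saved bson.M
	if err := ckpt.FindOne(ctx, bson.D{{Key: "_id", Value: "resume"}}).Decode(&saved); err == nil {
		opts.SetResumeAfter(saved["token"])
	}

	cs, err := coll.Watch(ctx, mongo.Pipeline{}, opts)
	if err != nil {
		// If the saved token has fallen out of the oplog window, Watch fails
		// and a full resync is the only option -- this is the main caveat.
		log.Fatal(err)
	}
	defer cs.Close(ctx)

	for cs.Next(ctx) {
		// ... apply the change event to the target here ...

		// Persist the latest resume token so the sync can restart from it.
		_, err := ckpt.UpdateOne(ctx,
			bson.D{{Key: "_id", Value: "resume"}},
			bson.D{{Key: "$set", Value: bson.D{{Key: "token", Value: cs.ResumeToken()}}}},
			options.Update().SetUpsert(true))
		if err != nil {
			log.Fatal(err)
		}
	}
}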
from mongoshake.
Related Issues (20)
- mongoshake: the source replica set has a delayed apply node, and errors are reported in the background
- It seems users cannot be migrated? HOT 1
- Using MongoShake with MongoDB ReplicaSet on Kubernetes.
- mongoshake in all mode: full sync completes, then the process errors out and exits when incremental sync starts
- When syncing in all mode, the log shows "run splitVector failed[cannot Decode to nil value], give up parallel fetching"
- How to completely stop MongoShake and prevent the hypervisor from restarting it. HOT 1
- After full sync completes, an invalid checkpoint[9223372036854775807[2147483647, 4294967295]] is written HOT 1
- Can the ARM architecture be supported? HOT 2
- Syncing with mongoshake in all mode causes the target primary server to OOM and the mongodb process gets killed
- Is syncing from multiple sources into one target database at the same time supported? HOT 1
- The collection filter does not take effect HOT 1
- can i use this tool to sync incr ops from replset mongodb(as src) to standalone mongodb(as dst)?
- About permission issues
- Cannot start ./collector.linux -conf=collector.conf on Mac M1
- Bad checksum?
- crash
- Does mongo-shake support syncing the full document of a modified record to Kafka? Currently only the before/after values of the changed fields are synced
- mongoshake with MongoDB V5.0 syncing to Kafka keeps reporting "There has no oplog collection in mongo db server" even though the oplog is enabled HOT 2
- Add tps to the configuration
- No incremental sync after the full sync