
TiDB-Binlog


TiDB-Binlog introduction

TiDB-Binlog is a tool used to collect TiDB's binary logs with the following features:

  • Data replication

    Synchronize data from the TiDB cluster to heterogeneous databases.

  • Real-time backup and recovery

    Back up the TiDB cluster to a dump file, which can be used for recovery.

  • Multiple output formats

    Supports MySQL, dump files, etc.

  • History replay

    Replay from any history point.

Documentation

Architecture

(architecture diagram)

Service list

Pump

Pump is a daemon that receives real-time binlogs from tidb-server and writes them to sequential disk files synchronously.

Drainer

Drainer collects binlogs from each Pump in the cluster, transforms them into various SQL dialects, and applies them to the downstream database or filesystem.

How to build

To check the code style and build binaries, you can simply run:

make build   # build all components

If you only want to build binaries, you can run:

make pump  # build pump

make drainer  # build drainer

When TiDB-Binlog is built successfully, you can find the binaries in the bin directory.

Run Test

Run all tests, including unit tests and integration tests:

make test

See tests for how to execute and add integration tests.

Deployment

The recommended startup sequence: PD -> TiKV -> Pump -> TiDB -> Drainer

The best way to install TiDB-Binlog is via TiDB-Binlog-Ansible.

Tutorial

Here's a tutorial to experiment with TiDB-Binlog (not for production use).

Config File

Contributing

Contributions are welcomed and greatly appreciated. See CONTRIBUTING.md for details on submitting patches and the contribution workflow.

License

TiDB-Binlog is under the Apache 2.0 license. See the LICENSE file for details.


tidb-binlog's Issues

Recover Table causes drainer exit in TiDB 3.0.0

Drainer exits when we run a recover table command on the TiDB cluster. I think this is a bug.

TiDB SQL commands

drop table sbtest4; 
recover table sysbench.sbtest4 ; 
-- check the DDL jobs

mysql> admin show ddl jobs
    -> ;
+--------+----------+------------+---------------+--------------+-----------+----------+-----------+-----------------------------------+--------+
| JOB_ID | DB_NAME  | TABLE_NAME | JOB_TYPE      | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | START_TIME                        | STATE  |
+--------+----------+------------+---------------+--------------+-----------+----------+-----------+-----------------------------------+--------+
|     64 | sysbench | sbtest5    | create table  | public       |        41 |       63 |         0 | 2019-07-09 19:07:38.022 +0800 CST | synced |
|     62 | db24     |            | create schema | public       |        61 |        0 |         0 | 2019-07-09 15:19:22.836 +0800 CST | synced |
|     60 | sysbench |            | drop table    | none         |        41 |       43 |         0 | 2019-07-09 13:20:28.702 +0800 CST | synced |
|     59 |          |            | drop schema   | none         |        57 |        0 |         0 | 2019-07-09 13:11:52.302 +0800 CST | synced |
|     58 |          |            | create schema | public       |        57 |        0 |         0 | 2019-07-09 10:08:47.516 +0800 CST | synced |
|     56 | sysbench | sbtest4    | recover table | public       |        41 |       49 |         0 | 2019-07-09 09:57:17.215 +0800 CST | synced |
|     55 | sysbench | sbtest4    | drop table    | none         |        41 |       49 |         0 | 2019-07-09 09:56:54.265 +0800 CST | synced |
|     54 | sysbench | sbtest3    | add index     | public       |        41 |       52 |     10000 | 2019-07-09 09:47:21.565 +0800 CST | synced |
|     53 | sysbench | sbtest3    | create table  | public       |        41 |       52 |         0 | 2019-07-09 09:47:19.815 +0800 CST | synced |
|     51 | sysbench | sbtest4    | add index     | public       |        41 |       49 |     10000 | 2019-07-09 09:47:18.565 +0800 CST | synced |
+--------+----------+------------+---------------+--------------+-----------+----------+-----------+-----------------------------------+--------+

Drainer Error

[2019/07/09 13:20:30.290 +08:00] [INFO] [collector.go:280] ["get ddl job"] [job="ID:60, Type:drop table, State:synced, SchemaState:none, SchemaID:41, TableID:43, RowCount:0, ArgLen:0, start time: 2019-07-09 13:20:28.702 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2019/07/09 13:20:30.291 +08:00] [ERROR] [server.go:267] ["syncer exited abnormal"] [
    error="handle ddl job ID:56, Type:none, State:synced, SchemaState:public, SchemaID:41, TableID:49, RowCount:0, ArgLen:0, start time: 2019-07-09 09:57:17.215 +0800 CST, Err:[meta:1146]table doesn't exist, ErrCount:1, SnapshotVersion:0 failed, the schema info: {\n\t\thasImplicitCol: false,\n\t\tschemaMetaVersion: 0,\n\t\tschemaNameToID: {\n\t\t\tmysql: 3,\n\t\t\tsysbench: 41,\n\t\t\ttest: 1\n\t\t},\n\t\ttableIDToName: {\n\t\t\t11: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: columns_priv\n\t\t\t},\n\t\t\t13: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: GLOBAL_VARIABLES\n\t\t\t},\n\t\t\t15: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: tidb\n\t\t\t},\n\t\t\t17: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: help_topic\n\t\t\t},\n\t\t\t19: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: stats_meta\n\t\t\t},\n\t\t\t21: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: stats_histograms\n\t\t\t},\n\t\t\t23: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: stats_buckets\n\t\t\t},\n\t\t\t25: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: gc_delete_range\n\t\t\t},\n\t\t\t27: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: gc_delete_range_done\n\t\t\t},\n\t\t\t29: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: stats_feedback\n\t\t\t},\n\t\t\t31: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: role_edges\n\t\t\t},\n\t\t\t33: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: default_roles\n\t\t\t},\n\t\t\t35: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: bind_info\n\t\t\t},\n\t\t\t37: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: stats_top_n\n\t\t\t},\n\t\t\t39: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: expr_pushdown_blacklist\n\t\t\t},\n\t\t\t43: {\n\t\t\t\tdb-name: sysbench,\n\t\t\t\ttbl-name: sbtest1\n\t\t\t},\n\t\t\t44: {\n\t\t\t\tdb-name: sysbench,\n\t\t\t\ttbl-name: sbtest2\n\t\t\t},\n\t\t\t5: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: user\n\t\t\t},\n\t\t\t52: {\n\t\t\t\tdb-name: sysbench,\n\t\t\t\ttbl-name: sbtest3\n\t\t\t},\n\t\t\t7: {\n\t\t\t\tdb-name: 
mysql,\n\t\t\t\ttbl-name: db\n\t\t\t},\n\t\t\t9: {\n\t\t\t\tdb-name: mysql,\n\t\t\t\ttbl-name: tables_priv\n\t\t\t}\n\t\t}\n\t}: table sbtest4(49) not found"] [errorVerbose="table sbtest4(49) not found\n
github.com/pingcap/errors.NotFound

This means that handling the recover table DDL job raises this error.

I checked the drainer source code and added skip logic for the ActionRecoverTable DDL job.
I built a new drainer version for testing, and it worked.

// schema.go, around line 393
// job type 25 is model.ActionRecoverTable
        if job.Type == model.ActionRecoverTable {
                log.Info("skip ActionRecoverTable job",
                        zap.Int64("job id", job.ID),
                        zap.Uint8("type", uint8(job.Type)),
                        zap.Int64("schema id", job.SchemaID),
                        zap.Int64("table id", job.TableID))
                return
        }

ordering guarantees when the downstream is Kafka

What are the ordering guarantees when the downstream is Kafka?
With multiple partitions in a Kafka topic, how does drainer decide which partition to send to?
If only one partition is used to guarantee ordering, will the throughput be sufficient?

Make integration tests faster (eg. finished in less than 3 minutes)

The task of running integration tests is the bottleneck of the build pipeline.
We may try to make it faster by:

  1. Reducing the number of SQL statements tested; some of the tests may belong to chaos testing rather than integration testing;
  2. Splitting the task into multiple parts and running them in parallel.

validate that Config.SyncerConfig.To is not nil to fix "drainer panic: runtime error"

launch command and arguments:

$ ~/.go/src/github.com/pingcap/tidb-binlog/bin/drainer --pd-urls=http://arch-dev:2379 --addr=arch-dev:8249 --data-dir=./drainer --L=debug

error logs and stdout:

2017/04/19 11:29:20 version.go:23: [info] Build TS: 2017-04-19 03:28:59
2017/04/19 11:29:20 version.go:24: [info] Go Version: go1.8.1
2017/04/19 11:29:20 version.go:25: [info] Go OS/Arch: linuxamd64
2017/04/19 11:29:20 client.go:97: [info] [pd] create pd client with endpoints [http://arch-dev:2379]
2017/04/19 11:29:20 client.go:179: [info] [pd] leader switches to: http://192.168.56.200:2379, previous:
2017/04/19 11:29:20 client.go:115: [info] [pd] init cluster id 6409853286206803787
2017/04/19 11:29:20 server.go:78: [info] clusterID of drainer server is 6409853286206803787
2017/04/19 11:29:20 client.go:273: [error] [pd] create tso stream error: rpc error: code = 1 desc = context canceled
2017/04/19 11:29:20 client.go:97: [info] [pd] create pd client with endpoints [http://arch-dev:2379]
2017/04/19 11:29:20 client.go:179: [info] [pd] leader switches to: http://192.168.56.200:2379, previous:
2017/04/19 11:29:20 client.go:115: [info] [pd] init cluster id 6409853286206803787
2017/04/19 11:29:20 client.go:97: [info] [pd] create pd client with endpoints [arch-dev:2379]
2017/04/19 11:29:20 client.go:179: [info] [pd] leader switches to: http://192.168.56.200:2379, previous:
2017/04/19 11:29:20 client.go:115: [info] [pd] init cluster id 6409853286206803787
2017/04/19 11:29:20 scan.go:132: [debug] txn getData nextStartKey["mDDLJobHi\xffstory\x00\x00\x00\xfc\x00\x00\x00\x00\x00\x00\x00h"], txn 391268941171523587
2017/04/19 11:29:30 collector.go:167: [info] node arch-dev:8250 get save point {0 0}
2017/04/19 11:29:40 schema.go:38: [info] [local schema/table] map[29:{test t1} 41:{test a} 44:{test b} 35:{test c}]
2017/04/19 11:29:40 schema.go:39: [info] [local schema] map[1:0xc4203e24d0]
2017/04/19 11:29:40 schema.go:40: [info] [ignore schema] map[7:{}]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xdf2919]

goroutine 168 [running]:
github.com/pingcap/tidb-binlog/drainer/executor.newMysql(0x0, 0xc420155b30, 0x44239b, 0x10, 0xf24220)
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/executor/mysql.go:14 +0x29
github.com/pingcap/tidb-binlog/drainer/executor.New(0x1049008, 0x5, 0x0, 0xc42033f1a0, 0x0, 0x1, 0x0)
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/executor/executor.go:19 +0x149
github.com/pingcap/tidb-binlog/drainer.createExecutors(0x1049008, 0x5, 0x0, 0x1, 0x1048aea, 0x5, 0x1047314, 0x2, 0x1047f9a)
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/util.go:195 +0xab
github.com/pingcap/tidb-binlog/drainer.(*Syncer).run(0xc42025c240, 0xc420447f50, 0x18, 0x20)
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:484 +0x9c
github.com/pingcap/tidb-binlog/drainer.(*Syncer).Start(0xc42025c240, 0xc420406000, 0x18, 0x20, 0x0, 0x0)
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:86 +0xa7
github.com/pingcap/tidb-binlog/drainer.(*Server).StartSyncer.func1(0xc4201bb180, 0xc420406000, 0x18, 0x20)
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/server.go:175 +0x81
created by github.com/pingcap/tidb-binlog/drainer.(*Server).StartSyncer
        /home/qupeng/.go/src/github.com/pingcap/tidb-binlog/drainer/server.go:180 +0x7a

when syncing to MySQL, timestamp data may be inconsistent if the explicit_defaults_for_timestamp variable differs

In TiDB, explicit_defaults_for_timestamp is ON by default, and the behavior looks like this:

mysql> create table tm(id int auto_increment, t_timestamp TIMESTAMP, primary key(id));
Query OK, 0 rows affected (0.24 sec)

mysql> show create table tm;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                        |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tm    | CREATE TABLE `tm` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `t_timestamp` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)

mysql> insert into tm(t_timestamp) values(null);
Query OK, 1 row affected (0.13 sec)

mysql> select * from tm;
+----+-------------+
| id | t_timestamp |
+----+-------------+
|  1 | NULL        |
+----+-------------+
1 row in set (0.01 sec)

In MySQL, if explicit_defaults_for_timestamp is OFF, it looks like this
(the MySQL default value is ON for >= 8.0.2 and OFF for <= 8.0.1):

mysql> create table tm(id int auto_increment, t_timestamp TIMESTAMP, primary key(id));
Query OK, 0 rows affected (0.03 sec)

mysql> show create table tm;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                    |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tm    | CREATE TABLE `tm` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `t_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> insert into tm(t_timestamp) values(null);
Query OK, 1 row affected (0.00 sec)

mysql> select * from tm;
+----+---------------------+
| id | t_timestamp         |
+----+---------------------+
|  1 | 2018-05-31 12:55:51 |
+----+---------------------+
1 row in set (0.01 sec)

See the MySQL documentation about explicit_defaults_for_timestamp.

pump should retry notifying registered drainers instead of exiting on startup

When pump starts, it seems to try only once to notify registered drainers, and it exits if that fails:

2019/04/24 12:14:30 version.go:33: [info] Release Version: v3.0.0-beta.1-31-g1a25971
2019/04/24 12:14:30 version.go:34: [info] Git Commit Hash: 1a259716e0aaef328f94e124e02e90be58d756dc
2019/04/24 12:14:30 version.go:35: [info] Build TS: 2019-04-23 12:42:17
2019/04/24 12:14:30 version.go:36: [info] Go Version: go1.11.2
2019/04/24 12:14:30 version.go:37: [info] Go OS/Arch: linuxamd64
2019/04/24 12:14:32 server.go:116: [info] clusterID of pump server is 6683140680737258511
2019/04/24 12:14:32 storage.go:116: [info] options: &{ValueLogFileSize:524288000 Sync:true KVConfig:<nil>}
2019/04/24 12:14:32 storage.go:1090: [info] Storage config: &{BlockCacheCapacity:8388608 BlockRestartInterval:16 BlockSize:4096 CompactionL0Trigger:8 CompactionTableSize:67108864 CompactionTotalSize:536870912 CompactionTotalSizeMultiplier:8 WriteBuffer:67108864 WriteL0PauseTrigger:24 WriteL0SlowdownTrigger:17}
2019/04/24 12:14:32 storage.go:188: [info] gcTS: 407757428283932672, maxCommitTS: 407928127082725377, headPointer: {Fid:1 Offset:2000}, handlePointer: {Fid:1 Offset:2000}
2019/04/24 12:14:32 server.go:316: [info] register success, this pump's node id is localhost.localdomain:8250
2019/04/24 12:14:32 node.go:160: [info] start try to notify drainer:  192.168.236.157:8249
2019/04/24 12:14:32 main.go:71: [error] pump server error, fail to notify all living drainer: notify drainer(192.168.236.157:8249); but return error(rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 192.168.236.157:8249: connect: connection refused")

Pump should retry for some period of time or number of retries before exiting. This would greatly simplify cluster startup, and would follow the example of other components that handle startup order more flexibly.

drainer process cannot start when hostname changed

Hi, I came across a problem when the hostname changed.
At first everything went fine. Then I shut down the drainer process; a while later the drainer server's hostname was changed by one of our jobs. After that, when I started the drainer process, it automatically registered a new node ID and tried to start two drainer processes on the same port.
According to the TiDB documentation, I set the old node ID's state to offline using the binlogctl tool, but it still does not work. Some logs below.

After the hostname changed twice, there are three drainer services using the same port on the same host:
+----------------------------------------------------+-------------------+--------+--------------------+---------------------+
| NodeID | Address | State | Max_Commit_Ts | Update_Time |
+----------------------------------------------------+-------------------+--------+--------------------+---------------------+
| gdxx-db-xxx060026-db-tidb.xxxxxxxxxxx.com8249:8249 | xxxxxxxx:8249 | online | 409660349216456705 | 2019-07-10 15:17:19 |
| gdxx-db-213060026-db-xxxxxxxxxxx.com:8249 | xxxxxxxx:8249 | online | 409660349216456705 | 2019-07-10 15:16:46 |
| GDxx-DB-213060026-db-xxxxxxxxxxx.com:8249 | xxxxxxxx:8249 | paused | 409660349216456705 | 2019-07-10 11:43:36 |
+----------------------------------------------------+-------------------+--------+--------------------+---------------------+

After setting its state to offline, the drainer process still cannot start normally:

2019/07/10 15:45:37 main.go:53: [error] start drainer server error, [tikv:9001]PD server timeout[try again later]
github.com/pingcap/errors.AddStack
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]/errors.go:174
github.com/pingcap/parser/terror.(*Error).GenWithStackByArgs
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]/terror/terror.go:231
github.com/pingcap/tidb/store/tikv.backoffType.TError
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/backoff.go:146
github.com/pingcap/tidb/store/tikv.(*Backoffer).Backoff
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/backoff.go:248
github.com/pingcap/tidb/store/tikv.(*RegionCache).loadRegion
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/region_cache.go:333
github.com/pingcap/tidb/store/tikv.(*RegionCache).LocateKey
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/region_cache.go:152
github.com/pingcap/tidb/store/tikv.(*Scanner).getData
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/scan.go:151
github.com/pingcap/tidb/store/tikv.(*Scanner).Next
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/scan.go:92
github.com/pingcap/tidb/store/tikv.newScanner
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/scan.go:51
github.com/pingcap/tidb/store/tikv.(*tikvSnapshot).Iter
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/store/tikv/snapshot.go:285
github.com/pingcap/tidb/structure.(*TxStructure).iterateHash
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/structure/hash.go:241
github.com/pingcap/tidb/structure.(*TxStructure).HGetAll
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/structure/hash.go:207
github.com/pingcap/tidb/meta.(*Meta).GetAllHistoryDDLJobs
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]+incompatible/meta/meta.go:663
github.com/pingcap/tidb-binlog/drainer.loadHistoryDDLJobs
/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/tidb-binlog/drainer/util.go:83
github.com/pingcap/tidb-binlog/drainer.(*Server).Start
/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/tidb-binlog/drainer/server.go:306
main.main
/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/tidb-binlog/cmd/drainer/main.go:52
runtime.main
/usr/local/go/src/runtime/proc.go:200
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1337

drainer crashes on VIEW ddl

If the drainer encounters a CREATE VIEW statement, it crashes:

2019/04/05 14:24:59 schema.go:235: [debug] handle ddl job id(1109): {"id":1109,"type":21,"schema_id":1062,"table_id":1108,"state":6,"err":null,"err_count":0,"
row_count":0,"raw_args":null,"schema_state":5,"snapshot_ver":0,"start_ts":407500477872209971,"dependency_id":0,"query":"CREATE VIEW v1 AS SELECT JSON_TYPE(JSO
N_OBJECT());","binlog":{"SchemaVersion":1441,"DBInfo":null,"TableInfo":{"id":1108,"name":{"O":"v1","L":"v1"},"charset":"utf8","collate":"utf8_general_ci","col
s":[{"id":1,"name":{"O":"JSON_TYPE(JSON_OBJECT())","L":"json_type(json_object())"},"offset":0,"origin_default":null,"default":null,"default_bit":null,"generat
ed_expr_string":"","generated_stored":false,"dependences":null,"type":{"Tp":0,"Flag":0,"Flen":0,"Decimal":0,"Charset":"","Collate":"","Elems":null},"state":5,
"comment":""}],"index_info":null,"fk_info":null,"state":5,"pk_is_handle":false,"comment":"","auto_inc_id":0,"max_col_id":1,"max_idx_id":0,"update_timestamp":4
07500477885317127,"ShardRowIDBits":0,"partition":null,"compression":""},"FinishedTS":407500477898948611},"version":1,"reorg_meta":null,"priority":0}
2019/04/05 14:24:59 syncer.go:638: [debug] receive publish binlog item: {startTS: 407502545135861761, commitTS: 407502545135861764, node: seastar.local:8250}
2019/04/05 14:24:59 server.go:225: [error] syncer exited, error table v1(1108) not found
github.com/pingcap/errors.NotFoundf
        /Users/kolbe/Devel/go/pkg/mod/github.com/pingcap/[email protected]/juju_adaptor.go:72
github.com/pingcap/tidb-binlog/drainer.(*Schema).ReplaceTable
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/schema.go:183
github.com/pingcap/tidb-binlog/drainer.(*Schema).handleDDL
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/schema.go:397
github.com/pingcap/tidb-binlog/drainer.(*Schema).handlePreviousDDLJobIfNeed
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/schema.go:238
github.com/pingcap/tidb-binlog/drainer.(*Syncer).run
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:496
github.com/pingcap/tidb-binlog/drainer.(*Syncer).Start
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:96
github.com/pingcap/tidb-binlog/drainer.(*Server).StartSyncer.func1
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/server.go:223
runtime.goexit
        /usr/local/Cellar/go/1.12.1/libexec/src/runtime/asm_amd64.s:1337

drainer failed to exec DDL if the downstream uses ProxySQL

version: v2.1.4
syncer.to is mysql (the downstream is actually a TiDB cluster behind ProxySQL)

sql:

CREATE TABLE films (
  id int(11),
  release_year int(11),
  category_id int(11),
  rating decimal(3,2)
);
insert into films values
(1,2015,1,8.00),
(2,2015,2,8.50),
(3,2015,3,9.00),
(4,2016,2,8.20),
(5,2016,1,8.40),
(6,2017,2,7.00);

error log:

{"log":"2019/07/03 16:44:08 pump.go:115: \u001b[0;37m[info] [pump 13274bc26df3:10060] create pull binlogs client\u001b[0m\n","stream":"stderr","time":"2019-07-03T08:44:08.719759354Z"}
{"log":"2019/07/03 16:44:08 pump.go:115: \u001b[0;37m[info] [pump 33576518e473:10060] create pull binlogs client\u001b[0m\n","stream":"stderr","time":"2019-07-03T08:44:08.719763484Z"}
{"log":"2019/07/03 16:44:08 pump.go:115: \u001b[0;37m[info] [pump 7410ddec094a:10060] create pull binlogs client\u001b[0m\n","stream":"stderr","time":"2019-07-03T08:44:08.719767392Z"}
{"log":"2019/07/03 16:44:09 syncer.go:503: \u001b[0;37m[info] [ddl][start]use `binlog`; CREATE TABLE films (\n","stream":"stderr","time":"2019-07-03T08:44:09.669498645Z"}
{"log":"  id int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.669534124Z"}
{"log":"  release_year int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.669542576Z"}
{"log":"  category_id int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.669549275Z"}
{"log":"  rating decimal(3,2)\n","stream":"stderr","time":"2019-07-03T08:44:09.669581963Z"}
{"log":");[commit ts]409506059518476290\u001b[0m\n","stream":"stderr","time":"2019-07-03T08:44:09.669588491Z"}
{"log":"2019/07/03 16:44:09 sql.go:107: \u001b[0;31m[error] exec sqls[[use `binlog`; CREATE TABLE films (\n","stream":"stderr","time":"2019-07-03T08:44:09.679585095Z"}
{"log":"  id int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.679603484Z"}
{"log":"  release_year int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.679610162Z"}
{"log":"  category_id int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.67963325Z"}
{"log":"  rating decimal(3,2)\n","stream":"stderr","time":"2019-07-03T08:44:09.679637648Z"}
{"log":");]] commit failed Error 1105: line 1 column 12 near \"`; CREATE TABLE films (\n","stream":"stderr","time":"2019-07-03T08:44:09.679641773Z"}
{"log":"  id int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.67964633Z"}
{"log":"  release_year int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.679650288Z"}
{"log":"  category_id int(11),\n","stream":"stderr","time":"2019-07-03T08:44:09.679654275Z"}
{"log":"  rating decimal(3,2)\n","stream":"stderr","time":"2019-07-03T08:44:09.679658037Z"}
{"log":")`\" (total length 121)\u001b[0m\n","stream":"stderr","time":"2019-07-03T08:44:09.679661988Z"}

And the Docker container that includes drainer restarts again and again.

drainer tries to create MySQL timestamp column with illegal default value

2019/04/05 15:26:01 syncer.go:516: [info] [ddl][start]use `timestamp_insert`; create table t (id int, c1 timestamp default null);;[commit ts]407502548517519418
2019/04/05 15:26:01 syncer.go:211: [debug] add job: {binlogTp: 2, mutationTp: Insert, sql: use `timestamp_insert`; create table t (id int, c1 timestamp default null);;, args: [], key: , commitTS: 407502548517519418, nodeID: seastar.local:8250}
2019/04/05 15:26:01 sql.go:87: [error] [exec][sql]use `timestamp_insert`; create table t (id int, c1 timestamp default null);;[args][][error]Error 1067: Invalid default value for 'c1'
2019/04/05 15:26:03 syncer.go:638: [debug] receive publish binlog item: {startTS: 407503677174579201, commitTS: 407503677174579201, node: seastar.local:8250}
2019/04/05 15:26:04 sql.go:87: [error] [exec][sql]use `timestamp_insert`; create table t (id int, c1 timestamp default null);;[args][][error]Error 1067: Invalid default value for 'c1'
2019/04/05 15:26:06 syncer.go:638: [debug] receive publish binlog item: {startTS: 407503677973856257, commitTS: 407503677973856257, node: seastar.local:8250}
2019/04/05 15:26:07 sql.go:87: [error] [exec][sql]use `timestamp_insert`; create table t (id int, c1 timestamp default null);;[args][][error]Error 1067: Invalid default value for 'c1'
2019/04/05 15:26:09 syncer.go:638: [debug] receive publish binlog item: {startTS: 407503678760288257, commitTS: 407503678760288257, node: seastar.local:8250}
2019/04/05 15:26:10 sql.go:87: [error] [exec][sql]use `timestamp_insert`; create table t (id int, c1 timestamp default null);;[args][][error]Error 1067: Invalid default value for 'c1'
2019/04/05 15:26:12 syncer.go:638: [debug] receive publish binlog item: {startTS: 407503679547244545, commitTS: 407503679547244545, node: seastar.local:8250}
2019/04/05 15:26:13 sql.go:87: [error] [exec][sql]use `timestamp_insert`; create table t (id int, c1 timestamp default null);;[args][][error]Error 1067: Invalid default value for 'c1'
2019/04/05 15:26:13 syncer.go:360: [fatal] Error 1067: Invalid default value for 'c1'
github.com/pingcap/errors.AddStack
        /Users/kolbe/Devel/go/pkg/mod/github.com/pingcap/[email protected]/errors.go:174
github.com/pingcap/errors.Trace
        /Users/kolbe/Devel/go/pkg/mod/github.com/pingcap/[email protected]/juju_adaptor.go:12
github.com/pingcap/tidb-binlog/pkg/sql.ExecuteTxnWithHistogram
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/pkg/sql/sql.go:92
github.com/pingcap/tidb-binlog/pkg/sql.ExecuteSQLsWithHistogram
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/pkg/sql/sql.go:58
github.com/pingcap/tidb-binlog/drainer/executor.(*mysqlExecutor).Execute
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/executor/mysql.go:33
github.com/pingcap/tidb-binlog/drainer.execute
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/util.go:125
github.com/pingcap/tidb-binlog/drainer.(*Syncer).sync
        /Users/kolbe/Devel/go/src/github.com/pingcap/tidb-binlog/drainer/syncer.go:356
runtime.goexit
        /usr/local/Cellar/go/1.12.1/libexec/src/runtime/asm_amd64.s:1337

binlog's commit ts less than last ts

When I start drainer, there are lots of errors saying the commit ts is less than the last ts.
I started drainer like this:

sudo ./drainer -config ../conf/drainer.toml -initial-commit-ts 410003709545414657 &

and the log is

2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410003997877600257, and is less than the last ts 410003997877600257
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410003998703353857, and is less than the last ts 410003998703353857
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410003999516000257, and is less than the last ts 410003999516000257
2019/07/25 15:56:33 syncer.go:297: [info] [write save point]410003997169811457
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004000328646657, and is less than the last ts 410004000328646657
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004001141293057, and is less than the last ts 410004001141293057
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004001953939458, and is less than the last ts 410004001953939458
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004002753478659, and is less than the last ts 410004002753478659
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004003566125057, and is less than the last ts 410004003566125057
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004004378771457, and is less than the last ts 410004004378771457
2019/07/25 15:56:33 merge.go:305: [error] binlog's commit ts is 410004005217632257, and is less than the last ts 410004005217632257

and it does not sync any SQL I execute unless I restart drainer. If I restart it, it syncs the operations I executed before the restart and then stops syncing again.

tidb-binlog/diff doesn't support the JSON type

For TiDB:

     t_json: {"key1":"value1","key2":"value2"}
     1 row in set (0.00 sec)

For MySQL:

     t_json: {"key1": "value1", "key2": "value2"}
     1 row in set (0.00 sec)

Note the spaces after the commas and colons: the textual data may differ, but the values should be treated as consistent. The comparison fails now.
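One way to compare such values is to parse both sides and compare the decoded structures, so whitespace and key order no longer matter. Below is a minimal sketch of this idea; jsonEqual is a hypothetical helper, not the actual diff implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// jsonEqual reports whether two JSON documents are semantically equal,
// ignoring insignificant whitespace and object-key order.
func jsonEqual(a, b string) (bool, error) {
	var va, vb interface{}
	if err := json.Unmarshal([]byte(a), &va); err != nil {
		return false, err
	}
	if err := json.Unmarshal([]byte(b), &vb); err != nil {
		return false, err
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	tidb := `{"key1":"value1","key2":"value2"}`
	mysql := `{"key1": "value1", "key2": "value2"}`
	eq, err := jsonEqual(tidb, mysql)
	fmt.Println(eq, err) // true <nil>
}
```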

pump should log version info before other logs

2018/04/09 19:35:03 Connected to 10.7.104.184:2181
2018/04/09 19:35:03 Authenticated: id=144229920926728194, timeout=40000
2018/04/09 19:35:03 Re-submitting `0` credentials after reconnect
2018/04/09 19:35:03 config.go:235: [info] get kafka addrs from zookeeper: 10.7.232.216:9092,10.7.104.184:9092,10.7.113.43:9092
2018/04/09 19:35:03 Recv loop terminated: err=EOF
2018/04/09 19:35:03 Send loop terminated: err=<nil>
2018/04/09 19:35:03 version.go:18: [info] Git Commit Hash: 2a7761f992dbf13ed8ccf59ca485e9c5cb5143d6
2018/04/09 19:35:03 version.go:19: [info] Build TS: 2018-04-02 10:27:34
2018/04/09 19:35:03 version.go:20: [info] Go Version: go1.10
2018/04/09 19:35:03 version.go:21: [info] Go OS/Arch: linuxamd64
time="2018-04-09T19:35:03+08:00" level=info msg="[pd] create pd client with endpoints [http://jira-cluster-pd:2379]"
time="2018-04-09T19:35:03+08:00" level=info msg="[pd] leader switches to: http://jira-cluster-pd-f56wf.jira-cluster-pd-peer.jira-tidb.svc:2379, previous: "
time="2018-04-09T19:35:03+08:00" level=info msg="[pd] init cluster id 6537979386539282199"
2018/04/09 19:35:03 server.go:126: [info] clusterID of pump server is 6537979386539282199
2018/04/09 19:35:03 binlogger.go:80: [info] create and lock binlog file data.pump/clusters/6537979386539282199/binlog-0000000000000000-20180409193503
2018/04/09 19:35:06 server.go:451: [info] generate fake binlog successfully

Pump logs Kafka connection info before the version info. If the Kafka connection fails, there will be no version info in the logs, which may make debugging more difficult.

NATS sink

I would like change events to be publishable to NATS.
As I understand it, only Kafka is currently supported?

NATS is written in Go.
Note that NATS and NATS Streaming are different systems. Only NATS Streaming offers durable ACKs, so it should be the target.

correctness bug about drainer consume binlogs from pump

Pump has an incorrect implementation of hadFinished.

The pump client in drainer only updates its current pos periodically when it meets complete binlogs (a prewrite binlog whose matching commit binlog has already arrived, called a complete binlog), except for fake binlogs and rollback binlogs. In some cases, current pos can fall behind pump's end pos even though drainer's pump client considers it finished.
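The prewrite/commit pairing involved here can be sketched as follows. This is a simplified illustration of the matching logic, not the real pump client; the type and method names are hypothetical.

```go
package main

import "fmt"

// binlogMatcher pairs a prewrite (P) binlog with its commit (C) binlog
// to decide whether a binlog is "complete" and the position may advance.
type binlogMatcher struct {
	pending map[int64]bool // start TS of prewrites awaiting commit/rollback
}

func newBinlogMatcher() *binlogMatcher {
	return &binlogMatcher{pending: map[int64]bool{}}
}

// prewrite records a P-binlog by its start TS.
func (m *binlogMatcher) prewrite(startTS int64) {
	m.pending[startTS] = true
}

// commit pairs a C-binlog with its P-binlog; it returns true only when
// the binlog is now complete (both P and C have been seen).
func (m *binlogMatcher) commit(startTS int64) bool {
	if m.pending[startTS] {
		delete(m.pending, startTS)
		return true
	}
	return false
}

// unfinished reports whether any prewrite is still waiting, i.e. the
// current position must not yet be treated as finished.
func (m *binlogMatcher) unfinished() bool { return len(m.pending) > 0 }

func main() {
	m := newBinlogMatcher()
	m.prewrite(100)
	fmt.Println(m.commit(100))  // true: complete binlog
	fmt.Println(m.unfinished()) // false
}
```

The bug described above amounts to treating the position as finished while unfinished() would still be true.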

Chinese characters garbled when tidb-binlog syncs data to Kafka

1. TiDB 2.1.4, Kafka 1.1.1

2. The database and table character set is utf8mb4

3. The Chinese characters are garbled as follows:

������ �¶����̕� ��������� ���������� ���̋����� ��� �2�2019-07-10 11:41:45 ������� �2�2019-07-10 11:41:45 �� �2�2019-07-10 11:41:45 �2�2019-07-10 11:41:45 ������ �¶����̕� ��������� ���������� ���̋����� ��� �2�2019-07-10 11:41:45 ������� �2�2019-07-10 11:41:45 �� �2�2019-07-10 11:41:45 �2�2019-07-10 11:41:45 ���

Unified command line arguments processing

drainer:

  • Doesn't support -v/--version
  • -h/--help is not in usage message
  • uses -L for log level

pump:

  • Doesn't support -v for version
  • -h/--help is not in help message
  • uses -debug for log level
  • -heartbeat-interval takes a uint while -metrics-interval takes an int; what is the reason for the difference between uint and int here?

cistern:

  • Doesn't support -v for version
  • -h/--help is not in help message
  • uses -debug for log level

These tools should process command line arguments in an intuitive manner.
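A unified convention could be built on the standard flag package, with every tool exposing the same version and log-level flags. This is a sketch only; the flag names below (-V, -L) are illustrative, not the tools' current flags.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseFlags defines the flags every tool would share: -V to print the
// version and exit, -L to set the log level. -h/--help is handled by
// the flag package itself and lists both in the usage message.
func parseFlags(args []string) (printVersion bool, logLevel string) {
	fs := flag.NewFlagSet("drainer", flag.ExitOnError)
	v := fs.Bool("V", false, "print version information and exit")
	l := fs.String("L", "info", "log level: debug, info, warn, error, fatal")
	fs.Parse(args)
	return *v, *l
}

func main() {
	printVersion, logLevel := parseFlags(os.Args[1:])
	if printVersion {
		fmt.Println("drainer v0.0.0 (example)")
		return
	}
	fmt.Println("log level:", logLevel)
}
```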

Validation of `-addr` and `-advertise-addr`

When given -addr :8250 without specifying -advertise-addr, both of these options end up being ":8250".

Validation should be added so that an invalid address that cannot be connected to fails the check.
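A minimal version of such a check could reject wildcard or empty hosts in the advertise address, since other components cannot dial ":8250". This is a hypothetical sketch of the validation, not the actual pump code.

```go
package main

import (
	"fmt"
	"net"
)

// validateAdvertiseAddr rejects advertise addresses that peers cannot
// connect to: an empty or wildcard host, or a missing port.
func validateAdvertiseAddr(addr string) error {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("invalid address %q: %v", addr, err)
	}
	if host == "" || host == "0.0.0.0" || host == "::" {
		return fmt.Errorf("advertise address %q must specify a routable host, not a wildcard", addr)
	}
	if port == "" {
		return fmt.Errorf("advertise address %q must specify a port", addr)
	}
	return nil
}

func main() {
	fmt.Println(validateAdvertiseAddr(":8250"))         // rejected: wildcard host
	fmt.Println(validateAdvertiseAddr("10.0.0.1:8250")) // <nil>
}
```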

Something wrong with genInsertSQLs

Well, the actual problem we found is that for tables with a composite primary key, the vals generated by the MySQL genInsertSQLs are missing the data for the second primary-key column. Of course, when TiDB encounters a table with a composite primary key, it should generate a hidden column as the primary key itself, so when syncing to TiDB the problem above should not occur.

(Issue reported by 若曦.)

Make the diff tool support --all-databases and --databases argument flags like mysqldump:

 -A, --all-databases Dump all the databases. This will be same as --databases
                      with all databases selected.
  -B, --databases     Dump several databases. Note the difference in usage; in
                      this case no tables are given. All name arguments are
                      regarded as database names. 'USE db_name;' will be
                      included in the output.

switch from juju/errors to pingcap/errors

I tried to do this switch, but what held me back is that tidb is vendored and still using juju/errors, including some of its esoteric APIs.
The latest version of tidb has switched to using pingcap/errors. Similarly, pd has switched to pkg/errors.
So I am hoping that the vendored tidb can be updated.

One of the main motivations for me to prompt tidb-binlog to do this change is that juju/errors is LGPL code, which makes proper distribution of tidb-binlog more difficult.

check whether the generated nodeID already exists in etcd.

Problem: prevent duplicate node IDs.

Simple way: when a node starts or restarts, read the node ID from the local directory. If there is no local node ID file, the node server should generate one, check whether it already exists in etcd, and then write it to the local node ID file.

better way:....
