mysql-binlog-connector-java

MySQL Binary Log connector. @osheroff's fork of @shyiko's project, probably the "official" continuation of it, maintained with help from the Debezium developers.

Usage

<dependency>
    <groupId>com.zendesk</groupId>
    <artifactId>mysql-binlog-connector-java</artifactId>
    <version>0.25.0</version>
</dependency>

The project initially started as a fork of open-replicator, but ended up as a complete rewrite. Key differences/features:

  • automatic binlog filename/position | GTID resolution
  • resumable disconnects
  • pluggable failover strategies
  • binlog_checksum=CRC32 support (for MySQL 5.6.2+ users)
  • secure communication over TLS
  • JMX-friendly
  • real-time stats
  • availability in Maven Central
  • no third-party dependencies
  • test suite over different versions of MySQL releases

If you are looking for something similar in other languages, check out siddontang/go-mysql (Go) or noplay/python-mysql-replication (Python).

Or get the latest JAR(s) from here.

Reading binary log file

File binlogFile = ...
EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
    EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
    EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
);
BinaryLogFileReader reader = new BinaryLogFileReader(binlogFile, eventDeserializer);
try {
    for (Event event; (event = reader.readEvent()) != null; ) {
        ...
    }
} finally {
    reader.close();
}

Tapping into MySQL replication stream

PREREQUISITES: The user you plan to connect with through BinaryLogClient MUST have the REPLICATION SLAVE privilege. Unless you specify binlogFilename/binlogPosition yourself (in which case automatic resolution won't kick in), the REPLICATION CLIENT privilege is required as well.

BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
    EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
    EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
);
client.setEventDeserializer(eventDeserializer);
client.registerEventListener(new EventListener() {

    @Override
    public void onEvent(Event event) {
        ...
    }
});
client.connect();

You can register a listener for onConnect / onCommunicationFailure / onEventDeserializationFailure / onDisconnect using client.registerLifecycleListener(...).

By default, BinaryLogClient starts from the current (at the time of connect) master binlog position. If you wish to kick off from a specific filename or position, use client.setBinlogFilename(filename) + client.setBinlogPosition(position).

client.connect() is blocking (meaning that client will listen for events in the current thread). client.connect(timeout), on the other hand, spawns a separate thread.

MariaDB

The stock BinaryLogClient works out of the box with MariaDB, but there are two differences:

First, MariaDB's GTIDs are different: they are still strings, but they parse differently. Second, MariaDB can send ANNOTATE_ROWS events, which allow you to recover the SQL used to generate rows in row-based replication.

See https://mariadb.com/kb/en/annotate_rows_log_event/ and client.setUseSendAnnotateRowsEvent(true)

Controlling event deserialization

You might need this for several reasons: you don't want to waste time deserializing events you won't need; there is no EventDataDeserializer defined for the event type you are interested in (or there is, but it contains a bug); you want certain types of events to be deserialized differently (perhaps *RowsEventData should contain the table name and not the id?); etc.

EventDeserializer eventDeserializer = new EventDeserializer();

// do not deserialize EXT_DELETE_ROWS event data, return it as a byte array
eventDeserializer.setEventDataDeserializer(EventType.EXT_DELETE_ROWS,
    new ByteArrayEventDataDeserializer());

// skip EXT_WRITE_ROWS event data altogether
eventDeserializer.setEventDataDeserializer(EventType.EXT_WRITE_ROWS,
    new NullEventDataDeserializer());

// use custom event data deserializer for EXT_DELETE_ROWS
eventDeserializer.setEventDataDeserializer(EventType.EXT_DELETE_ROWS,
    new EventDataDeserializer() {
        ...
    });

BinaryLogClient client = ...
client.setEventDeserializer(eventDeserializer);

Exposing BinaryLogClient through JMX

MBeanServer mBeanServer = ManagementFactory.getPlatformMBeanServer();

BinaryLogClient binaryLogClient = ...
ObjectName objectName = new ObjectName("mysql.binlog:type=BinaryLogClient");
mBeanServer.registerMBean(binaryLogClient, objectName);

// following bean accumulates various BinaryLogClient stats
// (e.g. number of disconnects, skipped events)
BinaryLogClientStatistics stats = new BinaryLogClientStatistics(binaryLogClient);
ObjectName statsObjectName = new ObjectName("mysql.binlog:type=BinaryLogClientStatistics");
mBeanServer.registerMBean(stats, statsObjectName);

Using SSL

Introduced in 0.4.0.

TLSv1.1 & TLSv1.2 require JDK 7+. Prior to MySQL 5.7.10, MySQL supported only TLSv1 (see Secure Connection Protocols and Ciphers).

To check that the MySQL server is properly configured with SSL support, run mysql -h host -u root -ptypeyourpasswordmaybe -e "show global variables like 'have_%ssl';" ("Value" should be "YES"). The state of the current session can be determined using \s ("SSL" should not be blank).

System.setProperty("javax.net.ssl.trustStore", "/path/to/truststore.jks");
System.setProperty("javax.net.ssl.trustStorePassword","truststore.password");
System.setProperty("javax.net.ssl.keyStore", "/path/to/keystore.jks");
System.setProperty("javax.net.ssl.keyStorePassword", "keystore.password");

BinaryLogClient client = ...
client.setSSLMode(SSLMode.VERIFY_IDENTITY);

Implementation notes

  • data of numeric types (tinyint, etc.) is always returned signed(!), regardless of whether the column definition includes the "unsigned" keyword.
  • data of var*/*text/*blob types is always returned as a byte array (for var* this holds starting from 1.0.0).
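Because of the first point, callers reading an UNSIGNED column must reinterpret the signed value themselves using the column's declared width. A minimal sketch of that reinterpretation in plain Java (illustration only, independent of the library; the method name is hypothetical):

```java
public class UnsignedReinterpret {
    // Reinterpret a signed value as unsigned, given the column's byte width.
    // E.g. an unsigned TINYINT of 200 arrives from the connector as -56 and
    // must be widened back to its unsigned range.
    static long asUnsigned(long signedValue, int byteWidth) {
        // Mask off everything above the column's width. For 8-byte BIGINT
        // UNSIGNED, a caller would need BigInteger instead (not shown here).
        long mask = (byteWidth >= 8) ? -1L : (1L << (byteWidth * 8)) - 1;
        return signedValue & mask;
    }

    public static void main(String[] args) {
        System.out.println(asUnsigned((byte) 200, 1));    // unsigned TINYINT -> 200
        System.out.println(asUnsigned((short) 65000, 2)); // unsigned SMALLINT -> 65000
    }
}
```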

Frequently Asked Questions

Q. What does a typical transaction look like?

A. GTID event (if gtid_mode=ON) -> QUERY event with "BEGIN" as sql -> ... -> XID event | QUERY event with "COMMIT" or "ROLLBACK" as sql.
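The sequence above can be grouped mechanically: everything between the "BEGIN" query and the XID (or COMMIT/ROLLBACK query) belongs to one transaction. A sketch of that grouping using plain event-type strings instead of the library's Event classes (names and shapes are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

public class TxnBoundary {
    // Collect event markers into transactions: open on QUERY:BEGIN,
    // close on XID (or QUERY:COMMIT / QUERY:ROLLBACK).
    static List<List<String>> groupTransactions(List<String> events) {
        List<List<String>> txns = new ArrayList<>();
        List<String> current = null;
        for (String e : events) {
            if (e.equals("QUERY:BEGIN")) {
                current = new ArrayList<>();
            } else if (e.equals("XID") || e.equals("QUERY:COMMIT") || e.equals("QUERY:ROLLBACK")) {
                if (current != null) {
                    txns.add(current);
                    current = null;
                }
            } else if (current != null) {
                current.add(e); // TABLE_MAP, row events, etc.
            }
        }
        return txns;
    }

    public static void main(String[] args) {
        List<String> stream = List.of("GTID", "QUERY:BEGIN", "TABLE_MAP", "EXT_WRITE_ROWS", "XID");
        System.out.println(groupTransactions(stream)); // [[TABLE_MAP, EXT_WRITE_ROWS]]
    }
}
```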

Q. EventData for inserted/updated/deleted rows has no information about the table (except for some weird id). How do I make sense of it?

A. Each WriteRowsEventData/UpdateRowsEventData/DeleteRowsEventData event is preceded by a TableMapEventData event, which contains the schema and table name. If for some reason you need to know the column names (types, etc.), the easiest way is to

select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION, COLUMN_DEFAULT, IS_NULLABLE,
DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE,
CHARACTER_SET_NAME, COLLATION_NAME from INFORMATION_SCHEMA.COLUMNS;
# see https://dev.mysql.com/doc/refman/5.6/en/columns-table.html for more information

(yes, the binary log DOES NOT include that piece of information).

You can find JDBC snippet here.
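In practice this means keeping a small cache keyed by table id, populated from each TABLE_MAP event and consulted when the row events arrive. A library-independent sketch of the pattern (types simplified; the real TableMapEventData carries the same fields):

```java
import java.util.HashMap;
import java.util.Map;

public class TableMapCache {
    // Maps the numeric table id carried by TABLE_MAP events to "schema.table".
    private final Map<Long, String> byId = new HashMap<>();

    // Call this for every TABLE_MAP event seen on the stream.
    void onTableMap(long tableId, String schema, String table) {
        byId.put(tableId, schema + "." + table);
    }

    // Call this when a Write/Update/DeleteRows event references tableId.
    String resolve(long tableId) {
        return byId.getOrDefault(tableId, "<unknown>");
    }

    public static void main(String[] args) {
        TableMapCache cache = new TableMapCache();
        cache.onTableMap(42L, "mbcj_test", "simple_type");
        System.out.println(cache.resolve(42L)); // mbcj_test.simple_type
    }
}
```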

Documentation

API overview

There are two entry points: BinaryLogClient (which you can use to read binary logs from a MySQL server) and BinaryLogFileReader (for offline log processing). Both rely on an EventDeserializer to deserialize the stream of events. Each Event consists of an EventHeader (containing, among other things, a reference to the EventType) and EventData. The aforementioned EventDeserializer has one EventHeaderDeserializer (EventHeaderV4Deserializer by default) and a collection of EventDataDeserializers. If there is no EventDataDeserializer registered for some particular type of Event, the default EventDataDeserializer kicks in (NullEventDataDeserializer).
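The dispatch described above (a per-type deserializer with a default fallback) is essentially a map lookup. A simplified illustration of the pattern with plain strings and functions, not the library's actual classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class DeserializerRegistry {
    // Per-event-type handlers; anything unregistered falls back to the
    // default, mirroring NullEventDataDeserializer (which yields no data).
    private final Map<String, Function<byte[], Object>> byType = new HashMap<>();
    private final Function<byte[], Object> fallback = bytes -> null;

    void register(String eventType, Function<byte[], Object> deserializer) {
        byType.put(eventType, deserializer);
    }

    Object deserialize(String eventType, byte[] payload) {
        return byType.getOrDefault(eventType, fallback).apply(payload);
    }

    public static void main(String[] args) {
        DeserializerRegistry r = new DeserializerRegistry();
        r.register("EXT_DELETE_ROWS", bytes -> bytes); // byte-array passthrough
        System.out.println(r.deserialize("EXT_DELETE_ROWS", new byte[]{1}) instanceof byte[]); // true
        System.out.println(r.deserialize("UNKNOWN", new byte[]{1}));                           // null
    }
}
```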

MySQL Internals Manual

For insight into the internals of MySQL, look here. The MySQL Client/Server Protocol and The Binary Log sections are particularly useful as reference documentation for the **.binlog.network and **.binlog.event packages.

Real-world applications

Some of the OSS using / built on top of mysql-binlog-connector-java:

  • apache/nifi An easy to use, powerful, and reliable system to process and distribute data.
  • debezium A low latency data streaming platform for change data capture (CDC).
  • zendesk/maxwell A MySQL-to-JSON Kafka producer.
  • mavenlink/changestream - A stream of changes for MySQL built on Akka.
  • mardambey/mypipe MySQL binary log consumer with the ability to act on changed rows and publish changes to different systems with emphasis on Apache Kafka.
  • ngocdaothanh/mydit MySQL to MongoDB data replicator.
  • sharetribe/dumpr A Clojure library for live replicating data from a MySQL database.
  • shyiko/rook Generic Change Data Capture (CDC) toolkit.
  • streamsets/datacollector Continuous big data ingestion infrastructure.
  • twingly/ecco MySQL replication binlog parser in JRuby.
  • zzt93/syncer A tool to sync and manipulate data from MySQL/MongoDB to ES/Kafka/MySQL with an eventual-consistency guarantee.

It's also used on a large scale in MailChimp. You can read about it here.

Development

git clone https://github.com/shyiko/mysql-binlog-connector-java.git
cd mysql-binlog-connector-java
mvn # shows how to build, test, etc. project

Deployment

Set up your settings.xml to have a "central" entry.

mvn deploy

Contributing

In lieu of a formal styleguide, please take care to maintain the existing coding style. Executing mvn checkstyle:check within the project directory should not produce any errors. If you are willing to install vagrant (required by the integration tests), it's highly recommended to check (with mvn clean verify) that there are no test failures before sending a pull request. Additional tests for any new or changed functionality are also very welcome.

License

Apache License, Version 2.0

mysql-binlog-connector-java's People

Contributors

acristu, adsr, ahmedahamid, auntyellow, burnison, chenyun, crispmark, elw00d, fivetran-mattalexander, georgewfraser, glarwood, gunnarmorling, janickr, jolivares, jpechane, methodmissing, mey-paytm, morozov, naros, osheroff, pprasse, redcape, rtreffer, sean-k1, shyiko, stevenczp, ucaninfosec, wingerx, wjlow, zzt93


mysql-binlog-connector-java's Issues

Is there any performance test report?

As the title describes, I want to know something about the performance of mysql-binlog-connector-java, and I would be very grateful for any relevant information.

pom.xml Java 8 Compatibility

Problem:

Currently the pom.xml defines the Java version compatibility like this:

<configuration>
<release>8</release>
</configuration>

But this doesn't work with Java 8.

Tested on macOS Big Sur 11.4 using OpenJDK 1.8:

openjdk version "1.8.0_282"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_282-b08)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.282-b08, mixed mode)

This gives the error:

[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:testCompile (default-testCompile) on project mysql-binlog-connector-java: Fatal error compiling: invalid flag: --release -> [Help 1]

Solution

But the classical way of defining this works fine with both JDK 8 and 11.

pom.xml:

<configuration>
    <source>1.8</source>
    <target>1.8</target>
</configuration>

So I would prefer to use source+target syntax in pom.xml.

Note about compatibility

I would also like the current version's compatibility to be clearly defined.
The code actually has only a few things that make it incompatible with Java 1.6.

Unsigned integer opt

I am a developer of a data sync tool, and I am trying to simplify the configuration that a user of the tool needs to do. For example, if a user needs to back up a database using the binlog method, they may use our tool. Currently, the user has to do some conversions during the sync process because of unsigned integers.

CREATE TABLE IF NOT EXISTS  `simple_type` (
	`id` bigint UNSIGNED NOT NULL AUTO_INCREMENT,
	`tinyint` tinyint UNSIGNED NOT NULL,
        ...,
	PRIMARY KEY (`id`)
)

Because the connector by default reads (unsigned) tinyint, (unsigned) smallint, etc. as int,

    protected Serializable deserializeTiny(ByteArrayInputStream inputStream) throws IOException {
        return (int) ((byte) inputStream.readInteger(1));
    }

    protected Serializable deserializeShort(ByteArrayInputStream inputStream) throws IOException {
        return (int) ((short) inputStream.readInteger(2));
    }

it seems that the sync tool has no way to know exactly what type the value represents, or even its size. I am a little confused about whether the int cast can be removed, in which case we could convert the value to a hex string according to the byte count when outputting to MySQL (of course, this fails again for int24). Or, even better, add an option to return integers as byte arrays, as is done for text/binary.

So is there any problem with this change, and do you have plans to do so?

java.io.EOFException when reading large number of UPDATE events

We have had a setup using this library for a while now with no issues. However, we recently found a deserialization issue when attempting to read a large number of UPDATE events under one transaction from the binlog.

Setup:
We have a DB with a column for each MySQL column type (33 columns).
When performing an update across a large number of these rows (1000+) we begin seeing the below exception.

Sample Update to update all rows:
UPDATE test_table SET smallint_column = 10;

Exception:

com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1671468952000, eventType=EXT_UPDATE_ROWS, serverId=190225211, headerLength=19, dataLength=8387, nextPosition=21497348, flags=0}
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:335) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:238) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:949) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:854) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at java.lang.Thread.run(Thread.java:834) ~[?:?]
Caused by: java.io.EOFException: Failed to read remaining 10476 of 14390 bytes from position 5409623. Block length: 0. Initial block length: 8383.
        at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.fill(ByteArrayInputStream.java:115) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:105) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeBlob(AbstractRowsEventDataDeserializer.java:403) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeCell(AbstractRowsEventDataDeserializer.java:199) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeRow(AbstractRowsEventDataDeserializer.java:143) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.UpdateRowsEventDataDeserializer.deserializeRows(UpdateRowsEventDataDeserializer.java:71) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.UpdateRowsEventDataDeserializer.deserialize(UpdateRowsEventDataDeserializer.java:58) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.UpdateRowsEventDataDeserializer.deserialize(UpdateRowsEventDataDeserializer.java:33) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:329) ~[mysql-binlog-connector-java-0.26.1.jar:0.26.1]
        ... 5 more

Additional Observations:

  • When updating >1000 rows the exception happens consistently.
  • When updating ~100 rows the exception happens inconsistently. Sometimes it happens, sometimes on restart/reconnection it passes.
  • Our event listener successfully receives the GTID event, QUERY event, TABLE_MAP event, followed by sometimes the first few EXT_UPDATE_ROWS events associated with this transaction before failure.
  • It does not consistently fail inside deserializeBlob. I have tried removing this column altogether and have seen the same issue coming from deserializeJson and deserializeNewDecimal.

The difference I have noticed so far is that in the failed events, the len param in readWithinBlockBoundaries(...) exceeds the blockLength. All bytes left within the blockLength are read, but the remaining len - blockLength bytes are not, and the read fails with the corresponding offsets printed in the exception above.

Binlog library version: 0.26.1
MySQL version: 8.0.28

JSON binary document is larger than json document for jsonarray parse for JSON_ARRAY sql update statement

@osheroff thanks for looking into this issue; I am moving it here from shyiko#328.

I got this error when using Kafka Connect with io.debezium for MySQL. The SQL statement I guess was run is below.

io.debezium mysql version : 1.2.0
mysql: >= 8.xx

sample sql:

UPDATE tesla.tasks
SET Configuration = JSON_ARRAY_APPEND(Configuration, '$.jobCodes', "00LS7513")
WHERE task_id=3

44-2d52-11e8-8e14-0050569a3d92:1-1513750463,81604171-2e5e-11e8-b151-0050569a18e0:1-2 (sid:9000003, cid:53244735)
ata value for 'security.securitygroup.Configuration' of type JSON for row [[66, 82, 95, 83, 86, 67, 95, 65, 68, 86, 73, 83, 79, 82], [66, 111, 100, 121, 32, 82,
parse and read a JSON value on Configuration JSON NOT NULL: The offset for the value in the JSON binary document is 14092, which is larger than the binary form
io.debezium.connector.mysql.MySqlValueConverters.lambda$convertJson$17(MySqlValueConverters.java:358)
io.debezium.jdbc.JdbcValueConverters.convertValue(JdbcValueConverters.java:1204)
io.debezium.connector.mysql.MySqlValueConverters.convertJson(MySqlValueConverters.java:346)
io.debezium.connector.mysql.MySqlValueConverters.lambda$converter$1(MySqlValueConverters.java:224)
io.debezium.relational.TableSchemaBuilder.lambda$createValueGenerator$5(TableSchemaBuilder.java:256)
io.debezium.relational.TableSchema.valueFromColumnData(TableSchema.java:143)
io.debezium.connector.mysql.RecordMakers$1.update(RecordMakers.java:269)
io.debezium.connector.mysql.RecordMakers$RecordsForTable.update(RecordMakers.java:499)
io.debezium.connector.mysql.BinlogReader.handleUpdate(BinlogReader.java:899)
io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:507)
com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1095)
com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:943)
com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)
com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)

the JSON binary document is 14092, which is larger than the binary form of the JSON document (11788 bytes)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parseArray(JsonBinary.java:503)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parse(JsonBinary.java:217)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parseObject(JsonBinary.java:400)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parse(JsonBinary.java:211)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parse(JsonBinary.java:205)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parse(JsonBinary.java:181)
com.github.shyiko.mysql.binlog.event.deserialization.json.JsonBinary.parseAsString(JsonBinary.java:168)
lambda$convertJson$17(MySqlValueConverters.java:356)

ng. Last offset stored = null, binlog reader near position = mysql-bin.001699/243807331 (io.debezium.connector.mysql.BinlogReader)
rocessing binlog event (io.debezium.connector.mysql.BinlogReader)
lue: null used for required field: "Configuration", schema type: STRING

TableMapEventMetadataDeserializer.deserialize NullPointerException

I ran into a problem: when trying to deserialize the table map event, the MetadataFieldType is null, so I get an NPE as a result (I do not use the latest version of the binlog connector): https://github.com/osheroff/mysql-binlog-connector-java/blob/master/src/main/java/com/github/shyiko/mysql/binlog/event/deserialization/TableMapEventMetadataDeserializer.java#L50

I debugged it and noticed that the metadata field type code I got from the binlog table map event is 128, and I don't think MySQL supports that many metadata field types. Does anyone know what might be the cause of this issue?

Debezium Connectors Throwing EOF / Connection reset while reading bin-logs

Hi, can you please help us with an issue we are facing: while reading the binlog, suddenly during high traffic we get an EOF exception / connection reset exception.
We hit this issue in mysql-binlog-connector-java 0.19.0, but I noticed that that repository is not maintained anymore, so I am asking here.

I read many blogs and saw that the issue can be caused by a duplicate server ID, but in the Debezium config we have set a different database.server.id for each of the MySQL connectors.

Architecture: we have one server with 7 schemas, and we are using seven MySQL connectors to read the server's binlog, with the Debezium whitelist property so that each connector only passes on data concerning its schema.

In MySQL logs we got this : 2020-09-03T09:34:35.745673Z 87255342 [Note] Aborted connection 87255342 to db: 'unconnected' user: 'kafka_source_user' host: '172.X.X.X' (Failed on my_net_write())

[2020-08-26 04:41:25,967] ERROR Error while deserializing binlog event at offset {ts_sec=1598429128, file=mysql-bin-changelog.179047, pos=120108368, server_id=284246538, event=394}.
Use the mysqlbinlog tool to view the problematic event: mysqlbinlog --start-position=120174225 --stop-position=120174457 --verbose mysql-bin-changelog.179047 (io.debezium.connector.mysql.BinlogReader)
[2020-08-26 04:41:25,967] ERROR Error during binlog processing. Last offset stored = {ts_sec=1598429125, file=mysql-bin-changelog.179047, pos=114769194, row=1, server_id=284246538, event=522}, binlog reader near position = mysql-bin-changelog.179047/120174122 (io.debezium.connector.mysql.BinlogReader)
[2020-08-26 04:41:25,967] ERROR Failed due to error: Error processing binlog event (io.debezium.connector.mysql.BinlogReader)
org.apache.kafka.connect.errors.ConnectException: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1598429128000, eventType=EXT_WRITE_ROWS, serverId=284246538, headerLength=19, dataLength=213, nextPosition=120174457, flags=0}
at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:208)
at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:508)
at com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1095)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:943)
at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)
at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1598429128000, eventType=EXT_WRITE_ROWS, serverId=284246538, headerLength=19, dataLength=213, nextPosition=120174457, flags=0}
at io.debezium.connector.mysql.BinlogReader.handleServerIncident(BinlogReader.java:575)
at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:492)
... 5 more
Caused by: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1598429128000, eventType=EXT_WRITE_ROWS, serverId=284246538, headerLength=19, dataLength=213, nextPosition=120174457, flags=0}
at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:300)
at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:223)
at io.debezium.connector.mysql.BinlogReader$1.nextEvent(BinlogReader.java:224)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:922)
... 3 more
Caused by: java.io.EOFException
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:190)
at java.io.InputStream.read(InputStream.java:170)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.fill(ByteArrayInputStream.java:96)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:89)
at io.debezium.connector.mysql.RowDeserializers.deserializeDatetimeV2(RowDeserializers.java:407)
at io.debezium.connector.mysql.RowDeserializers$WriteRowsDeserializer.deserializeDatetimeV2(RowDeserializers.java:207)
at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeCell(AbstractRowsEventDataDeserializer.java:178)
at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeRow(AbstractRowsEventDataDeserializer.java:138)
at com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer.deserializeRows(WriteRowsEventDataDeserializer.java:64)
at com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer.deserialize(WriteRowsEventDataDeserializer.java:56)
at com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer.deserialize(WriteRowsEventDataDeserializer.java:32)
at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:294)
... 6 more
[2020-08-26 04:41:25,968] ERROR Error during binlog processing. Last offset stored = {ts_sec=1598429125, file=mysql-bin-changelog.179047, pos=114769194, row=1, server_id=284246538, event=522}, binlog reader near position = mysql-bin-changelog.179047/120174457 (io.debezium.connector.mysql.BinlogReader)

ArrayIndexOutOfBoundsException in BufferedSocketInputStream.java:65

Hi @osheroff again

We got another exception in mysql-binlog-connector-java while using Debezium recently. This is the full error log we get:

org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
    at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:42)
    at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener.onEventDeserializationFailure(MySqlStreamingChangeEventSource.java:1193)
    at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:958)
    at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:606)
    at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:850)
    at java.lang.Thread.run(Thread.java:748)
Caused by: io.debezium.DebeziumException
    at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.wrap(MySqlStreamingChangeEventSource.java:1146)
    ... 5 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at com.github.shyiko.mysql.binlog.io.BufferedSocketInputStream.read(BufferedSocketInputStream.java:65)
    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readWithinBlockBoundaries(ByteArrayInputStream.java:262)
    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:241)
    at java.io.InputStream.skip(InputStream.java:224)
    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.skipToTheEndOfTheBlock(ByteArrayInputStream.java:280)
    at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:305)
    at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:232)
    at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$1.nextEvent(MySqlStreamingChangeEventSource.java:233)
    at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:945)
    ... 3 more

The error log shows a java.lang.ArrayIndexOutOfBoundsException in the code:

System.arraycopy(buffer, offset, b, off, bytesRemainingInBuffer);

After digging into the code more, I believe the issue is caused by the value of limit in this line.

According to the InputStream documentation, the read(...) method returns -1 when it reaches the end of the stream, so limit can be -1 here. Then bytesRemainingInBuffer becomes a negative value, which causes the java.lang.ArrayIndexOutOfBoundsException in the System.arraycopy method.

I wrote a test case to reproduce this error in BufferedSocketInputStreamTest:

    @Test
    public void testReadToTheEnd() throws Exception {
        BufferedSocketInputStream in = new BufferedSocketInputStream(new ByteArrayInputStream(new byte[]{
            0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}), 20);
        assertEquals(in.read(), 0);
        assertEquals(in.read(), 1);
        byte[] buf = new byte[6];
        assertEquals(in.read(buf, 0, buf.length), 6);
        assertEquals(in.read(buf, 0, buf.length), 6);
        assertEquals(in.read(buf, 0, buf.length), 3); // only 3 bytes left in stream
        assertEquals(in.read(buf, 0, buf.length), -1); // got the end of stream, so return -1
    }

This test case throws the same java.lang.ArrayIndexOutOfBoundsException as mentioned above. I have a patch fixing this issue; could you help review it? Thank you.

No break keyword in the switch-case code block.

version: 0.25.3
at:
com/github/shyiko/mysql/binlog/event/deserialization/TableMapEventMetadataDeserializer.java:92

In this switch-case code block, the case ENUM_AND_SET_COLUMN_CHARSET: statement does not end with the break keyword.
Is this a bug, or are there other concerns?
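For background: in Java, a case without break falls through into the next case. That is sometimes intentional (two cases sharing handling), but unless it is commented, it usually indicates a bug. A small self-contained demo of the behavior in question:

```java
public class FallThroughDemo {

    static String classify(int code) {
        StringBuilder sb = new StringBuilder();
        switch (code) {
            case 1:
                sb.append("one,");
                // no break: execution intentionally falls through into case 2
            case 2:
                sb.append("two");
                break;
            default:
                sb.append("other");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(classify(1)); // one,two  (fall-through)
        System.out.println(classify(2)); // two
        System.out.println(classify(9)); // other
    }
}
```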

JSON partial update can break reading of field names

For MySQL 8, when a partial update is executed, it is possible that the field names do not follow one after another but are sparse, with garbage in between.

This means it is necessary to read keys based on offsets too, in the same way as values - #64

Test failure with MySQL 8.0

The test com.github.shyiko.mysql.binlog.BinaryLogClientIntegrationTest.testDeserializationOfYEAR() fails with MySQL 8.0:

INFO: Received Event{header=EventHeaderV4{timestamp=1626463811000, eventType=QUERY, serverId=1, headerLength=19, dataLength=56, nextPosition=3236, flags=8}, data=QueryEventData{threadId=10, executionTime=0, errorCode=0, database='mbcj_test', sql='BEGIN'}}
Jul 16, 2021 7:30:11 PM com.github.shyiko.mysql.binlog.TraceLifecycleListener onEventDeserializationFailure
SEVERE: Event deserialization failed
com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1626463811000, eventType=TABLE_MAP, serverId=1, headerLength=19, dataLength=59, nextPosition=3314, flags=0}
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:309)
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeTableMapEventData(EventDeserializer.java:281)
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:228)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:952)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599)
	at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:857)
	at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: java.io.IOException: Unsupported table metadata field type: 128
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventMetadataDeserializer.deserialize(TableMapEventMetadataDeserializer.java:47)
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventDataDeserializer.deserialize(TableMapEventDataDeserializer.java:47)
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventDataDeserializer.deserialize(TableMapEventDataDeserializer.java:1)
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:303)
	... 6 more

No idea where this metadata type 128 is coming from; I don't see it documented anywhere.

Which makes me think: perhaps we should handle unexpected metadata types more gracefully, logging a warning rather than throwing an exception? Adhering to "be conservative in what you send and liberal in what you accept", we might simply ignore this case. WDYT, @osheroff?
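As a sketch of that "be liberal in what you accept" approach: the optional table-map metadata is a sequence of type/length/value fields, so an unknown type can be skipped by its length and logged instead of aborting deserialization. This is a simplified, hypothetical reader (the real format encodes the length as a packed integer, and the class and method names here are illustrative, not the library's):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.logging.Logger;

public class TolerantMetadataReader {

    private static final Logger LOG = Logger.getLogger(TolerantMetadataReader.class.getName());

    // Walks <type byte><length byte><payload> fields; skips (and logs) any type
    // above knownTypeMax instead of throwing. Returns the number of known fields.
    public static int readFields(byte[] metadata, int knownTypeMax) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(metadata));
        int parsed = 0;
        while (in.available() > 0) {
            int type = in.readUnsignedByte();
            int length = in.readUnsignedByte(); // simplification: real format uses a packed integer
            if (type > knownTypeMax) {
                LOG.warning("Skipping unsupported table metadata field type " + type);
                in.skipBytes(length); // skip by length instead of failing the event
                continue;
            }
            in.skipBytes(length); // a real reader would parse the payload here
            parsed++;
        }
        return parsed;
    }

    public static void main(String[] args) throws IOException {
        // type 1 (length 2) is "known"; type 128 (length 1) is unknown and skipped
        System.out.println(readFields(new byte[]{1, 2, 0, 0, (byte) 128, 1, 0}, 10)); // 1
    }
}
```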

Can't capture the binlog data of update with 4 tables join

Hi,
I'm working on the CaptureChangeMySQL processor with NiFi version 1.11.4. When I capture an UPDATE that joins 4 tables, I only receive the update event for one table.
Below is my SQL.

  UPDATE sub_orders_1001_1 s
      INNER JOIN main_order_1001 m ON m.main_order_id = s.main_order_id
      LEFT JOIN sub_order_mc_1001 ms ON ms.sub_order_id = s.sub_order_id
      LEFT JOIN order_promotion_info_1001 p ON p.sub_order_id = s.sub_order_id
  SET
      s.merchant_id = '1',
      s.merchant_store_id = 100,
      s.store_sales_type = m.store_sales_type,
      s.store_channel_id = 100,
      s.presell_type = m.order_type,
      ms.store_sales_type = m.store_sales_type,
      ms.order_type = m.order_type,
      p.store_sales_type = m.store_sales_type,
      p.presell_type = m.order_type
  WHERE s.id BETWEEN 1 AND 100 AND s.source = 1 AND s.merchant_id = ''

I would like to capture the binlog change of mysql for this sql.

But the wrong behavior is that I only capture the binlog change of table order_promotion_info_1001; I cannot capture the change of sub_orders_1001_1.
Where is the mistake? I'm new to binlog, but isn't it row by row?
Is this a NiFi or a mysql-binlog-connector-java issue?

/**
 * Describes inserted rows (within a single table).
 * Used in case of RBR (5.1.18+).
 */
EXT_WRITE_ROWS, EXT_UPDATE_ROWS, EXT_DELETE_ROWS

MySQL version is 5.7.20. Will the events above be handled? What does "(within a single table)" mean?

Could you point out how to capture all events for all tables of the 4-table join?
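For what it's worth, row-based replication logs each table separately: a multi-table UPDATE produces a separate TABLE_MAP event plus an (EXT_)UPDATE_ROWS event for each modified table, keyed by table id, so a consumer has to track the id-to-name mapping across events. A simplified, self-contained sketch of that routing (stand-in types and method names, not the library's TableMapEventData/UpdateRowsEventData classes):

```java
import java.util.HashMap;
import java.util.Map;

public class RowEventRouter {

    // tableId -> table name, learned from TABLE_MAP events
    private final Map<Long, String> tableById = new HashMap<>();

    // Each row event is preceded by a TABLE_MAP event for its table.
    public void onTableMap(long tableId, String tableName) {
        tableById.put(tableId, tableName);
    }

    // Resolves which table an (EXT_)UPDATE_ROWS event belongs to.
    public String onUpdateRows(long tableId) {
        return tableById.get(tableId);
    }

    public static void main(String[] args) {
        RowEventRouter router = new RowEventRouter();
        router.onTableMap(119, "sub_orders_1001_1");
        router.onTableMap(120, "order_promotion_info_1001");
        System.out.println(router.onUpdateRows(119)); // sub_orders_1001_1
        System.out.println(router.onUpdateRows(120)); // order_promotion_info_1001
    }
}
```

If only one table's events arrive, the filtering is likely happening upstream (e.g. in the consumer's table filter), not in the binlog itself.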

Thanks!

Paul Yang

Can't read from binlog of mysql hosted by azure when ssl enabled

We are trying to use debezium read changes from a mysql instance hosted on Azure.

We have realized that when SSL is enabled, we get an authentication error while trying to read the binlog. A sample error stack trace is below:

connect_1  |    at io.debezium.connector.mysql.BinlogReader.doStart(BinlogReader.java:445)
connect_1  |    at io.debezium.connector.mysql.AbstractReader.start(AbstractReader.java:116)
connect_1  |    at io.debezium.connector.mysql.ChainedReader.startNextReader(ChainedReader.java:206)
connect_1  |    at io.debezium.connector.mysql.ChainedReader.readerCompletedPolling(ChainedReader.java:158)
connect_1  |    at io.debezium.connector.mysql.AbstractReader.cleanupResources(AbstractReader.java:309)
connect_1  |    at io.debezium.connector.mysql.AbstractReader.poll(AbstractReader.java:288)
connect_1  |    at io.debezium.connector.mysql.ChainedReader.poll(ChainedReader.java:146)
connect_1  |    at io.debezium.connector.mysql.MySqlConnectorTask.doPoll(MySqlConnectorTask.java:443)
connect_1  |    at io.debezium.connector.common.BaseSourceTask.poll(BaseSourceTask.java:131)
connect_1  |    at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:270)
connect_1  |    at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:237)
connect_1  |    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:184)
connect_1  |    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:234)
connect_1  |    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
connect_1  |    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
connect_1  |    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
connect_1  |    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
connect_1  |    at java.base/java.lang.Thread.run(Thread.java:834)
connect_1  | Caused by: java.io.EOFException
connect_1  |    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:190)
connect_1  |    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readInteger(ByteArrayInputStream.java:46)
connect_1  |    at com.github.shyiko.mysql.binlog.network.protocol.PacketChannel.read(PacketChannel.java:59)
connect_1  |    at com.github.shyiko.mysql.binlog.BinaryLogClient.authenticate(BinaryLogClient.java:722)
connect_1  |    at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:515)
connect_1  |    at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:860)
connect_1  |    ... 1 more

This is most probably related to not passing the certificate authority file. Nevertheless, since we don't get any authorization exception during the snapshot phase, this error is very confusing for us. Is there any option to bypass this error without passing a certificate file?

Memory leak in EventDeserializer caused by tableMapEventByTableId

Hi @osheroff

There is a long-standing memory leak issue in EventDeserializer caused by tableMapEventByTableId. I saw you fixed it by using a bounded map with a size limit of 10000.

code here:

This avoids huge memory usage when there are a large number of tables in one MySQL instance, but the map can still take a lot of memory when it is almost full. We hit this issue in Debezium.


However, according to this post: https://dba.stackexchange.com/questions/51873/replication-binary-log-parsingtableid-generation-on-delete-cascade-handling
(a better investigation post but in Chinese: http://blog.chinaunix.net/uid-26896862-id-3329896.html)

A table id is not a constant value in one MySQL instance: the server assigns a new id when no cached table id is found, or rolls to a new id when the server-side cache (my_hash_search_using_hash_value) is full. Thus it is not necessary to keep this tableMapEventByTableId map around forever.

I have a fix for this issue: simply clear this map after receiving a MySQL binlog ROTATE event (definition here: https://dev.mysql.com/doc/internals/en/event-meanings.html).
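A minimal sketch of that idea (illustrative names, not the library's internals): drop the cached TABLE_MAP entries whenever a ROTATE event arrives, since table ids are not stable across binlog files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TableMapCache {

    private final Map<Long, Object> tableMapEventByTableId = new LinkedHashMap<>();

    public void onEvent(String eventType, long tableId, Object tableMapData) {
        if ("ROTATE".equals(eventType)) {
            // Table ids may be reassigned after rotation, so cached entries are stale.
            tableMapEventByTableId.clear();
        } else if ("TABLE_MAP".equals(eventType)) {
            tableMapEventByTableId.put(tableId, tableMapData);
        }
    }

    public int size() {
        return tableMapEventByTableId.size();
    }

    public static void main(String[] args) {
        TableMapCache cache = new TableMapCache();
        cache.onEvent("TABLE_MAP", 1L, "t1");
        cache.onEvent("TABLE_MAP", 2L, "t2");
        System.out.println(cache.size()); // 2
        cache.onEvent("ROTATE", 0L, null);
        System.out.println(cache.size()); // 0
    }
}
```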

Could not parse the binlog of MySQL >= 8.0.23

Question on multi sources

Hello,

MySQL supports replicating multiple source databases to one replica using replication channels. Is this supported by the binlog extraction? Or should we use one BinaryLogClient per source?

Thank you.

memory leak into WriteRowsEventDataDeserializer.deserializeRows

Env: We are using StreamSets to ingest data from a MySQL 5.7 binlog to Kafka. It uses mysql-binlog-connector-java version 0.23.3, StreamSets batches are set to 300 records, and the database charset is UTF-8.
Issue: every 3-5 days, a memory leak occurs.
Analysis: using MAT, I found that the local variable List<Serializable[]> in WriteRowsEventDataDeserializer.deserializeRows held more than 2 million objects with a total size of more than 7 GB, so full GC happened; but these are live objects, so the memory cannot be reclaimed.
When the issue occurred, the ByteArrayInputStream size was 524288 bytes, but the code below looped 2 million times:

while (inputStream.available() > 0) {
    result.add(deserializeRow(tableId, includedColumns, inputStream));
}


Meanwhile, I found that the ByteArrayInputStream content contained garbled characters; some columns of the data rows contain Chinese characters. When opened as ANSI the content renders as mojibake, but it is readable when opened as UTF-8.

Question: is this issue caused by Chinese characters in the InputStream? How can we fix it?

MariaDb binlog compression

Hi,

I'm trying mysql-binlog-connector 0.21.0 against MariaDB. Whenever binlog compression is enabled, it throws java.io.IOException: Unknown event type.

Is binlog compression supported by mysql-binlog-connector?

Following is how one of the events look (retrieved using mysqlbinlog):

Event{
  header=EventHeaderV4{
    timestamp=1592003319000,
    eventType=QUERY,
    serverId=5,
    headerLength=19,
    dataLength=280,
    nextPosition=753402908,
    flags=8
  },
  data=QueryEventData{
    threadId=0,
    executionTime=0,
    errorCode=0,
    database='',
    sql='# Dummy event replacing event type 160 that slave cannot handle. '
  }
}

Azure connection reading binlog fails with 'The connection string may not be right. Please visit portal for references'

I'm trying to use Debezium to read the binlog of a MySQL 5.7 instance on Azure, but am getting "The connection string may not be right. Please visit portal for references."

The Debezium part of the connection works using the regular MySQL JDBC driver, but connecting for the binlog fails. If you have any insight into what to do, or want me to provide more data, please let me know.

I tried pulling the latest versions, which still didn't help:
com.zendesk:mysql-binlog-connector-java:0.23.3
mysql:mysql-connector-java:8.0.22
io.debezium:debezium-connector-mysql:1.4.0.Final
io.debezium:debezium-embedded:1.4.0.Final

Stack Trace:

2021-01-14 17:51:41,154 ERROR i.d.e.EmbeddedEngine [pool-2-thread-1] Unable to initialize and start connector's task class 'io.debezium.connector.mysql.MySqlConnectorTask' with config: {snapshot.mode=never, database.logger=com.mysql.cj.log.Slf4JLogger, database.user=xxx@zzz, database.password=********, offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore, database.server.name=me-debezium-testname, database.ssl.mode=required, database.serverTimezone=UTC, snapshot.locking.mode=NONE, offset.storage.file.filename=debeziumOffsetFile, connector.class=io.debezium.connector.mysql.MySqlConnector, database.port=3306, database.history.file.filename=dbhistory.dat, database.hostname=zzz.mysql.database.azure.com, bigint.unsigned.handling.mode=precise, table.include.list=^vvv.*\.vvv$, database.server.id=96406, database.history=io.debezium.relational.history.FileDatabaseHistory, name=me-debezium-test} org.apache.kafka.connect.errors.ConnectException: Failed to authenticate to the MySQL database at zzz.mysql.database.azure.com:3306 with user 'xxx@zzz'
	at io.debezium.connector.mysql.BinlogReader.doStart(BinlogReader.java:444)
	at io.debezium.connector.mysql.AbstractReader.start(AbstractReader.java:127)
	at io.debezium.connector.mysql.ChainedReader.startNextReader(ChainedReader.java:206)
	at io.debezium.connector.mysql.ChainedReader.start(ChainedReader.java:103)
	at io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:272)
	at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:106)
	at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:758)
	at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:171)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.github.shyiko.mysql.binlog.network.AuthenticationException: The connection string may not be right. Please visit portal for references.
	at com.github.shyiko.mysql.binlog.network.Authenticator.readResult(Authenticator.java:85)
	at com.github.shyiko.mysql.binlog.network.Authenticator.authenticate(Authenticator.java:70)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:526)
	at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:839)
	... 1 more

Here's a better unit test of the whole issue, using only MySQL JDBC and mysql-binlog-connector-java:

        System.out.println("Test MySQL connection:");
        String connJdcString = "jdbc:mysql://" + dbHost + "?verifyServerCertificate=false&useSSL=true&serverTimezone=UTC";
        try (Connection conn = DriverManager.getConnection(connJdcString, dbUser, dbPass);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW MASTER STATUS")) {
            while (rs.next()) {
                System.out.println("  Log file: " + rs.getString(1));
            }
        }
        System.out.println("Test MySQL binary log:");
        BinaryLogClient client = new BinaryLogClient(
                dbHost,
                3306,
                dbUser,
                dbPass);
        client.setSSLMode(SSLMode.REQUIRED);
        EventDeserializer eventDeserializer = new EventDeserializer();
        eventDeserializer.setCompatibilityMode(
                EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
                EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
        );
        client.setEventDeserializer(eventDeserializer);
        client.registerEventListener(new BinaryLogClient.EventListener() {

            @Override
            public void onEvent(Event event) {
                System.out.println(event);
            }
        });
        client.connect();

Output is:

Test MySQL connection:
  Log file: mysql-bin.000001
Test MySQL binary log:
Jan 14, 2021 6:45:42 PM com.github.shyiko.mysql.binlog.BinaryLogClient tryUpgradeToSSL
INFO: SSL enabled
Exception in thread "main" com.github.shyiko.mysql.binlog.network.AuthenticationException: The connection string may not be right. Please visit portal for references.
	at com.github.shyiko.mysql.binlog.network.Authenticator.readResult(Authenticator.java:85)
	at com.github.shyiko.mysql.binlog.network.Authenticator.authenticate(Authenticator.java:70)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:526)
	at com.example.debeziumtest.MainTest.main(MainTest.java:46)

Note that without serverTimezone=UTC, I get the following error during the first JDBC connection:

The server time zone value '' is unrecognized or represents more than one time zone. You must configure either the server or JDBC driver (via the 'serverTimezone' configuration property) to use a more specific time zone value if you want to utilize time zone support.

The connector does not work with ProxySQL >= 2.1.0

Hi,

I've upgraded ProxySQL from 2.0.18 to 2.1.0 and the connector stopped working. It works well when I downgrade ProxySQL back to 2.0.18.

mysql_connector_1           | Jul 02, 2021 10:00:53 AM com.github.shyiko.mysql.binlog.BinaryLogClient$2 run
mysql_connector_1           | WARNING: Failed to establish connection in 29998ms. Forcing disconnect.

mysql_connector_1           | 	at io.debezium.connector.mysql.BinlogReader.doStart(BinlogReader.java:435)
mysql_connector_1           | 	at io.debezium.connector.mysql.AbstractReader.start(AbstractReader.java:116)
mysql_connector_1           | 	at io.debezium.connector.mysql.ChainedReader.startNextReader(ChainedReader.java:206)
mysql_connector_1           | 	at io.debezium.connector.mysql.ChainedReader.start(ChainedReader.java:103)
mysql_connector_1           | 	at io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:279)
mysql_connector_1           | 	at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:106)
mysql_connector_1           | 	at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:232)
mysql_connector_1           | 	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
mysql_connector_1           | 	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
mysql_connector_1           | 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
mysql_connector_1           | 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
mysql_connector_1           | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
mysql_connector_1           | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
mysql_connector_1           | 	at java.base/java.lang.Thread.run(Thread.java:834)
mysql_connector_1           | Caused by: java.util.concurrent.TimeoutException: BinaryLogClient was unable to connect in 30000ms
mysql_connector_1           | 	at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:862)
mysql_connector_1           | 	at io.debezium.connector.mysql.BinlogReader.doStart(BinlogReader.java:407)
mysql_connector_1           | 	... 13 more
mysql_connector_1           | 2021-07-02 10:00:53,260 ERROR  ||  WorkerSourceTask{id=my-mysql-connector-0} Task is being killed and will not recover until manually restarted   [org.apache.kafka.connect.runtime.WorkerTask]

Here are the logs from the ProxySQL container:

proxysql1_1                 | 2021-07-02 10:00:53 MySQL_Thread.cpp:4838:process_all_sessions(): [WARNING] Closing unhealthy client connection 172.28.1.17:47360

You can check the Proxysql changelogs at https://github.com/sysown/proxysql/releases/tag/v2.1.0

Consider to use Docker for integration tests

Hey @osheroff, not sure whether logging an issue is the right avenue for having a more open-ended discussion like this; let me know if I should raise this elsewhere.

Anyway, how would you feel about using Docker instead of the onetimeserver set-up for starting MySQL during integration tests? I think that'd be easier to maintain, e.g. no need to upload binaries after each MySQL release; via Testcontainers, this would be rather simple. I also felt the onetimeserver was a bit lacking in terms of diagnosability (it took me quite a while to get to the bottom of the version issue we discussed; in hindsight it's rather simple, but when you don't know this codebase at all, the "Child exited without printing info!" message is pretty opaque). Docker also wouldn't require GCC to be installed for compiling the wrapper. Of course, it'd require Docker to be installed :)

Curious what you think about this?

MariaDB disconnections with error "A slave with the same server_uuid/server_uid has connected to master"

Hello,

I have two slaves running against single MariaDB master.

After a while I get an error and a disconnection (I had to dig this error out in the debugger; it was swallowed by a try-catch at BinaryLogClient.java:970).

A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '-test-bin.000015' at 238636500, the last event read from '-test-bin.000015' at 256, the last byte read from '***-bin.000015' at 238636500.

I am not sure how this should be handled. Maybe the client should support setting the slave uuid?

Set up CI via GitHub Actions

Hey @osheroff, I'd like to set up GitHub Actions to build and test the project with MySQL 5.7 and 8.0 when a) PRs are sent/updated and b) something gets pushed to master. Any concerns from your side?

The update event can't resolve field names?

UpdateRowsEventData{tableId=119, includedColumnsBeforeUpdate={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, includedColumns={0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, rows=[ {before=[xxx], after=[xxx]} ]}
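That is expected: binlog row events carry only column positions (includedColumns is a bitset of column ordinals), never names. Names have to be resolved from the table schema, e.g. by querying information_schema.columns when the corresponding TABLE_MAP event is seen. A minimal sketch of the mapping step (a hypothetical helper, not part of the library):

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class ColumnNameResolver {

    // Maps the set bits of includedColumns onto the table's ordered column names.
    public static List<String> resolve(BitSet includedColumns, List<String> schemaColumns) {
        List<String> names = new ArrayList<>();
        for (int i = includedColumns.nextSetBit(0); i >= 0; i = includedColumns.nextSetBit(i + 1)) {
            names.add(schemaColumns.get(i));
        }
        return names;
    }

    public static void main(String[] args) {
        BitSet included = new BitSet();
        included.set(0);
        included.set(2);
        System.out.println(resolve(included, List.of("id", "merchant_id", "store_sales_type")));
        // [id, store_sales_type]
    }
}
```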

BinlogReader logging EventDataDeserializationException on app startup

We have a Java 11 Spring Boot app integrated with Kafka/Debezium.

Versions

debeziumSpringbootStarterVersion=1.0.4.RELEASE
kafkaVersion=2.8.0
debeziumConnectorMysql=1.0.0.Final
mysqlBinlogConnector=0.21.0

On a MySQL (5.7) database, a change is made (user dropped / tables recreated with data). Debezium's BinlogReader starts to read the binlogs, but then logs the following exception:

2021-07-22 10:05:08.584  INFO 24570 --- [debyte.com:3306] i.d.r.history.DatabaseHistoryMetrics     : Already applied 10969 database changes
2021-07-22 10:05:10.134 ERROR 24570 --- [debyte.com:3306] i.debezium.connector.mysql.BinlogReader  : Error while deserializing binlog event at offset {ts_sec=1626941231, file=mysql-bin.000532, pos=75496603, server_id=2001, event=876}.
Use the mysqlbinlog tool to view the problematic event: mysqlbinlog --start-position=82610388 --stop-position=82618497 --verbose mysql-bin.000532
2021-07-22 10:05:10.135 ERROR 24570 --- [debyte.com:3306] i.debezium.connector.mysql.BinlogReader  : Error during binlog processing. Last offset stored = {ts_sec=1626941227, file=mysql-bin.000532, pos=71066122, row=272, server_id=2001, event=518}, binlog reader near position = mysql-bin.000532/82610388
2021-07-22 10:05:10.150 ERROR 24570 --- [debyte.com:3306] i.debezium.connector.mysql.BinlogReader  : Failed due to error: Error processing binlog event

org.apache.kafka.connect.errors.ConnectException: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1626941231000, eventType=EXT_WRITE_ROWS, serverId=2001, headerLength=19, dataLength=8090, nextPosition=82618497, flags=0}
    at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230) ~[debezium-connector-mysql-1.0.0.Final.jar:1.0.0.Final]
    at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:207) ~[debezium-connector-mysql-1.0.0.Final.jar:1.0.0.Final]
    at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:536) ~[debezium-connector-mysql-1.0.0.Final.jar:1.0.0.Final]
    at com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1158) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:1005) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at com.github.shyiko.mysql.binlog.BinaryLogClient.connectWithTimeout(BinaryLogClient.java:517) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at com.github.shyiko.mysql.binlog.BinaryLogClient.access$1100(BinaryLogClient.java:90) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:881) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at java.base/java.lang.Thread.run(Thread.java:829) ~[na:na]
Caused by: java.lang.RuntimeException: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1626941231000, eventType=EXT_WRITE_ROWS, serverId=2001, headerLength=19, dataLength=8090, nextPosition=82618497, flags=0}
    at io.debezium.connector.mysql.BinlogReader.handleServerIncident(BinlogReader.java:604) ~[debezium-connector-mysql-1.0.0.Final.jar:1.0.0.Final]
    at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:519) ~[debezium-connector-mysql-1.0.0.Final.jar:1.0.0.Final]
    ... 6 common frames omitted
Caused by: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1626941231000, eventType=EXT_WRITE_ROWS, serverId=2001, headerLength=19, dataLength=8090, nextPosition=82618497, flags=0}
    at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:300) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:223) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at io.debezium.connector.mysql.BinlogReader$1.nextEvent(BinlogReader.java:239) ~[debezium-connector-mysql-1.0.0.Final.jar:1.0.0.Final]
    at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:984) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    ... 4 common frames omitted
Caused by: java.io.EOFException: null
    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:190) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at java.base/java.io.InputStream.read(InputStream.java:271) ~[na:na]
    at java.base/java.io.InputStream.skip(InputStream.java:531) ~[na:na]
    at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.skipToTheEndOfTheBlock(ByteArrayInputStream.java:216) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:296) ~[mysql-binlog-connector-java-0.21.0.jar:0.21.0]
    ... 7 common frames omitted

2021-07-22 10:05:10.150  INFO 24570 --- [debyte.com:3306] i.debezium.connector.mysql.BinlogReader  : Error processing binlog event, and propagating to Kafka Connect so it stops this connector. Future binlog events read before connector is shutdown will be ignored.
2021-07-22 10:05:10.152  INFO 24570 --- [debyte.com:3306] i.debezium.connector.mysql.BinlogReader  : Stopped reading binlog after 167936 events, last recorded offset: {ts_sec=1626941227, file=mysql-bin.000532, pos=71066122, row=272, server_id=2001, event=518}
2021-07-22 10:05:37.521  INFO 24570 --- [pool-1-thread-1] i.d.connector.mysql.MySqlConnectorTask   : Stopping MySQL connector task

This also is logged on application restart.

The suggested mysqlbinlog command:

mysqlbinlog --start-position=82610388 --stop-position=82618497 --verbose mysql-bin.000532

Returns

/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#210722  8:06:26 server id 2001  end_log_pos 123 CRC32 0xc7bff6dc   Start: binlog v 4, server v 5.7.19-log created 210722  8:06:26
BINLOG '
Aif5YA/RBwAAdwAAAHsAAAAAAAQANS43LjE5LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
Adz2v8c=
'/*!*/;
# at 82610388
#210722  8:07:11 server id 2001  end_log_pos 82618497 CRC32 0xd2561ccc  Write_rows: table id 43469
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

I have no idea why this is happening or how to code around it; the error is handled internally and not propagated up to the app.

Debugging the input stream at the point where the error is thrown:

 private EventData deserializeEventData(ByteArrayInputStream inputStream, EventHeader eventHeader,
            EventDataDeserializer eventDataDeserializer) throws EventDataDeserializationException {

       inputStream.skipToTheEndOfTheBlock();


and evaluating

new String(inputStream.readAllBytes(), StandardCharsets.UTF_8);

does indeed throw

Method threw 'java.io.EOFException' exception.

Has anyone else had a similar issue and know what I've got wrong or need to check?

Deserialize blob data from dynamic_column

Hi,

I am working on an app to stream data from the binlog to our warehouse. A few of the tables have dynamic columns, which are of the BLOB datatype in MySQL, and the data comes in as a byte[] through the binlog. I am trying to deserialize it into a Map-like object but am not able to. Looking for suggestions on how to do this.

Thanks in advance.

Bharath

BinaryLogClient auto detects wrongly MariaDB instead of MySQL

Hi,

With version 0.27.1, the BinaryLogClient tries to detect whether the GtidSet is a MariaDB GtidSet, using the pattern ^\d+-\d+-\d+.
We are using MySQL 8.0 and have a GtidSet that matches this pattern, so the client thinks it is a MariaDB GtidSet and crashes on connect, because the server is in fact MySQL 8.0, not MariaDB.
Our MySQL GtidSet is the following:
07212070-4330-3bc8-8a3a-01e34be47bc3:1-141692942,a0c4a949-fae8-30f3-a4d2-fee56a1a9307:1-1427643460,a16ef643-1d4a-3fd9-a86e-1adeb836eb2d:1-1411988930,b0d822f4-5a84-30d3-a929-61f64740d7ac:1-59364

As we know that we are connecting to MySQL, can we avoid auto-detection and tell the BinaryLogClient to use MySQL rather than MariaDB? I don't see the need for auto-detection, as the user of the library knows which database they are connecting to.
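The ambiguity is easy to demonstrate: with a prefix-style match (assuming the client uses a find()-like check, which the crash suggests), the reported pattern also matches a MySQL GTID set whose server UUID begins with digit groups.

```java
import java.util.regex.Pattern;

public class GtidDetectDemo {

    // Detection pattern from the report; a MariaDB GTID is domain-server-sequence.
    static final Pattern MARIADB_GTID = Pattern.compile("^\\d+-\\d+-\\d+");

    public static void main(String[] args) {
        String mariadb = "0-1-1234";
        String mysql = "07212070-4330-3bc8-8a3a-01e34be47bc3:1-141692942";
        System.out.println(MARIADB_GTID.matcher(mariadb).find()); // true, as intended
        // The UUID prefix "07212070-4330-3..." also satisfies \d+-\d+-\d+:
        System.out.println(MARIADB_GTID.matcher(mysql).find());   // true, misdetected
    }
}
```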

The exception we see in the logs:

java.lang.NumberFormatException: For input string: "3bc8"
at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
at java.base/java.lang.Long.parseLong(Unknown Source)
at java.base/java.lang.Long.parseLong(Unknown Source)
at com.github.shyiko.mysql.binlog.MariadbGtidSet$MariaGtid.<init>(MariadbGtidSet.java:185)
at com.github.shyiko.mysql.binlog.MariadbGtidSet$MariaGtid.parse(MariadbGtidSet.java:189)
at com.github.shyiko.mysql.binlog.MariadbGtidSet.<init>(MariadbGtidSet.java:39)
at com.github.shyiko.mysql.binlog.BinaryLogClient.setGtidSet(BinaryLogClient.java:355)

Polling loop doesn't finish on disconnect with Java 15

Apologies upfront for this rather vague issue report. I'm still filing it to raise awareness of a problem we encountered when running the Debezium test suite on Java 15. The issue we observe is that the loop in BinaryLogClient#listenForEventPackets() never finishes when the connection is closed: the peek() call driving the loop still returns a value != -1 in this case, so the loop never terminates.

As a mitigation, we have a fix in Debezium, which bubbles up the IOException during deserialization, causing the loop to finish.

It's not quite clear why this issue happens with Java 15 only; it also seems to happen only "after some time". If you wanted to reproduce it, you could build this PR, uncomment the fix, and run mvn verify -am -pl debezium-connector-mysql -DskipTests=true -Dmaven.failsafe.debug=true -Dformat.skip=true.

Deserialize table map metadata failed with java.io.EOFException: Failed to read next byte from position 15

The optional table metadata fields are not sorted; reading a field of type UNKNOWN_METADATA_FIELD_TYPE that is not in the last position causes this issue.

See the MySQL rows_event code:
https://github.com/mysql/mysql-server/blob/8.0/libbinlogevents/include/rows_event.h#L185

There are some optional metadata defined. They are listed in the table
@ref Table_table_map_event_optional_metadata. Optional metadata fields
follow null_bits. Whether binlogging an optional metadata is decided by the
server. The order is not defined, so they can be binlogged in any order.
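A sketch of the order-tolerant reading this implies (an illustrative TLV layout, not the library's exact wire format): since each optional metadata field carries a type and a length, a reader can skip unknown types by their declared length wherever they appear, instead of assuming they only occur last:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class Main {
    // Reads a stream of (type, length, value...) fields and returns the value
    // of the one field type we understand (type 1), skipping everything else.
    static int readKnownField(byte[] metadata) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(metadata);
        int known = -1;
        int type;
        while ((type = in.read()) != -1) {
            int length = in.read();
            if (type == 1) {
                known = in.read();  // parse the field we understand
            } else {
                in.skip(length);    // unknown type: skip by length, don't abort
            }
        }
        return known;
    }

    public static void main(String[] args) throws IOException {
        // An unknown field (type 200, length 2) BEFORE the known one (type 1):
        byte[] metadata = { (byte) 200, 2, 9, 9, 1, 1, 7 };
        System.out.println(readKnownField(metadata)); // 7
    }
}
```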

Related sql statement:
create table workdb.debezium_all_types(`id` int(11) not null auto_increment, `tinyi` tinyint(4) default '0', `smalli` smallint(6) default '0', `mediumi` mediumint(9) default '0', `bigi` bigint(20) default '0', `f` float default '0.001', `dou` double default '0', `de` decimal(36,2) default '0.00', `c` char(10) collate utf8_bin default 'char', `v` varchar(255) collate utf8_bin default 'varchar', `te` text collate utf8_bin, `tinyt` tinytext collate utf8_bin, `mediumt` mediumtext collate utf8_bin, `longt` longtext collate utf8_bin, `b` blob, `tinyb` tinyblob, `mediumb` mediumblob, `longb` longblob, `d` date default null, `t` time not null default '18:30:30', `dt` datetime default null, `ts` timestamp not null default current_timestamp, `y` year(4) default '2000', primary key (`id`)) engine=innodb auto_increment=1 default charset=utf8 collate=utf8_bin;
INSERT INTO workdb.debezium_all_types_2 (tinyi,mediumi,bigi,f,dou,de,c,v,te,tinyt,mediumt,longt,b,tinyb,mediumb,longb,d,t,dt,ts,y)
VALUES (127, 8388607, 225073858, 1.17549435138, 2.225073858, 12345678910111213141516.17, 'Maria', 'Maria carchar', 'Maria text', 'Maria tiny text', 'Maria medium text', 'Maria long text', 65534, 65534, 16777214, 4294967294, now(), now(), now(), now(), '2022');

Debezium error log:
org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
	at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:50)
	at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener.onCommunicationFailure(MySqlStreamingChangeEventSource.java:1233)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:980)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599)
	at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:857)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.debezium.DebeziumException: Failed to deserialize data of EventHeaderV4{timestamp=1655558927000, eventType=TABLE_MAP, serverId=1, headerLength=19, dataLength=104, nextPosition=7697425, flags=0}
	at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.wrap(MySqlStreamingChangeEventSource.java:1188)
	... 5 more
Caused by: com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException: Failed to deserialize data of EventHeaderV4{timestamp=1655558927000, eventType=TABLE_MAP, serverId=1, headerLength=19, dataLength=104, nextPosition=7697425, flags=0}
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:309)
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeTableMapEventData(EventDeserializer.java:281)
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:228)
	at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$1.nextEvent(MySqlStreamingChangeEventSource.java:230)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:952)
	... 3 more
Caused by: java.io.EOFException: Failed to read next byte from position 15
	at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:213)
	at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readPackedNumber(ByteArrayInputStream.java:173)
	at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readPackedInteger(ByteArrayInputStream.java:152)
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventMetadataDeserializer.readIntegerPairs(TableMapEventMetadataDeserializer.java:164)
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventMetadataDeserializer.deserialize(TableMapEventMetadataDeserializer.java:86)
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventDataDeserializer.deserialize(TableMapEventDataDeserializer.java:47)
	at com.github.shyiko.mysql.binlog.event.deserialization.TableMapEventDataDeserializer.deserialize(TableMapEventDataDeserializer.java:27)
	at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:303)
	... 7 more

isContainedWithin on MariadbGtidSet breaks the contract of its super method

com.github.shyiko.mysql.binlog.GtidSet#isContainedWithin says the following in its doc:

Determine if the GTIDs represented by this object are contained completely within the supplied set of GTIDs. Note that if two GtidSets are equal, then they both are subsets of the other.
Params:
other – the other set of GTIDs; may be null
Returns:
true if all of the GTIDs in this set are equal to or completely contained within the supplied set of GTIDs, or false otherwise

Unfortunately, MariadbGtidSet's override doesn't behave like this: if the sets are equivalent, it returns false, even though sets are subsets of themselves, so it can't be used to test logical equivalence. I noticed this when I tried to test whether sets a and b were logically equivalent using a.isContainedWithin(b) && b.isContainedWithin(a).
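The expected contract can be illustrated with plain sets (a toy stand-in, not the library's GtidSet classes): containment must be reflexive, so mutual containment tests logical equivalence:

```java
import java.util.Set;
import java.util.TreeSet;

public class Main {
    // Containment in the sense of the javadoc quoted above: every element of
    // "self" is present in "other"; equal sets contain each other.
    static boolean isContainedWithin(Set<Long> self, Set<Long> other) {
        return other != null && other.containsAll(self);
    }

    public static void main(String[] args) {
        Set<Long> a = new TreeSet<>(Set.of(1L, 2L, 3L));
        Set<Long> b = new TreeSet<>(Set.of(1L, 2L, 3L));
        // The equivalence idiom from the paragraph above:
        System.out.println(isContainedWithin(a, b) && isContainedWithin(b, a)); // true
    }
}
```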

This is a PR I opened to address this: #90

Replication thread stuck because SSLSocket lacks a timeout

There are new binlogs on the MySQL master, but the BinaryLogClient can't capture them.
Suggest setting SO_TIMEOUT on the SSLSocket.
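A minimal demonstration of the suggestion (a plain Socket here; SSLSocket inherits setSoTimeout from Socket): with SO_TIMEOUT set, a read on a silent connection fails fast instead of blocking forever:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class Main {
    // Connect to a local server that never writes, then read with a timeout
    // set. Returns true if the read timed out rather than blocking.
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0); // never sends anything
             Socket client = new Socket("localhost", server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read();
                return false;
            } catch (SocketTimeoutException e) {
                return true; // without the timeout this read hangs forever
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTimesOut(500)); // true
    }
}
```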

time: 18:06
 
"blc-a.b.c.9:3306" #41 prio=5 os_prio=0 cpu=53999.75ms elapsed=33186.63s tid=0x000056177f342000 nid=0x42 runnable  [0x00007f0c43568000]
   java.lang.Thread.State: RUNNABLE
       at java.net.SocketInputStream.socketRead0([email protected]/Native Method)
       at java.net.SocketInputStream.socketRead([email protected]/SocketInputStream.java:115)
       at java.net.SocketInputStream.read([email protected]/SocketInputStream.java:168)
       at java.net.SocketInputStream.read([email protected]/SocketInputStream.java:140)
       at sun.security.ssl.SSLSocketInputRecord.read([email protected]/SSLSocketInputRecord.java:476)
       at sun.security.ssl.SSLSocketInputRecord.readHeader([email protected]/SSLSocketInputRecord.java:470)
       at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket([email protected]/SSLSocketInputRecord.java:70)
       at sun.security.ssl.SSLSocketImpl.readApplicationRecord([email protected]/SSLSocketImpl.java:1354)
       at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:963)
       at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:870)
       at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readWithinBlockBoundaries(ByteArrayInputStream.java:222)
       at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.peek(ByteArrayInputStream.java:194)
       at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:930)
       at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:606)
       at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:850)
       at java.lang.Thread.run([email protected]/Thread.java:834)
 

 time: 10-29 18:40
 
"blc-a.b.c.9:3306" #41 prio=5 os_prio=0 cpu=53999.75ms elapsed=35226.59s tid=0x000056177f342000 nid=0x42 runnable  [0x00007f0c43568000]
   java.lang.Thread.State: RUNNABLE
       at java.net.SocketInputStream.socketRead0([email protected]/Native Method)
       at java.net.SocketInputStream.socketRead([email protected]/SocketInputStream.java:115)
       at java.net.SocketInputStream.read([email protected]/SocketInputStream.java:168)
       at java.net.SocketInputStream.read([email protected]/SocketInputStream.java:140)
       at sun.security.ssl.SSLSocketInputRecord.read([email protected]/SSLSocketInputRecord.java:476)
       at sun.security.ssl.SSLSocketInputRecord.readHeader([email protected]/SSLSocketInputRecord.java:470)
       at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket([email protected]/SSLSocketInputRecord.java:70)
       at sun.security.ssl.SSLSocketImpl.readApplicationRecord([email protected]/SSLSocketImpl.java:1354)
       at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:963)
       at sun.security.ssl.SSLSocketImpl$AppInputStream.read([email protected]/SSLSocketImpl.java:870)
       at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readWithinBlockBoundaries(ByteArrayInputStream.java:222)
       at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.peek(ByteArrayInputStream.java:194)
       at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:930)
       at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:606)
       at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:850)
       at java.lang.Thread.run([email protected]/Thread.java:834)
 
   Locked ownable synchronizers:
       - <0x0000000680ed3ea8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
       - <0x0000000680ed3f98> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

Maintainership of this repo

Hey @osheroff, I hope this finds you well. Didn't know how to reach out, hence opening this issue.

It seems there hasn't been much activity in this repo, so I was wondering what your plans are. We (the Debezium team) got a bug report recently for which we'd likely need to push a fix to this project, and have a release too. I can take care of merging the patch (once we have it), but releases are still limited to you. Do you think it would make sense for us to pick up ownership and e.g. create a fork using our own coordinates? We'd be up for it if you don't plan to maintain this further; but if you do, I think we'd be happier if we could keep pushing fixes here (and, credit where credit is due, benefiting from any you make, of course). Would be great to hear your thoughts and plans. Thanks a lot!

Issue with date transformation using debezium

We use Debezium with MySQL connector to get data from a MariaDB server. A few months ago we upgraded our MariaDB to version 10.4 and after that, we started having this issue with the connectors.
Some columns in a table are datetime(6).

Task threw an uncaught and unrecoverable exception
org.apache.kafka.connect.errors.ConnectException: Invalid value for MonthOfYear (valid values 1 - 12): -83
        at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:241)
        at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:207)
        at io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onEventDeserializationFailure(BinlogReader.java:1150)
        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:965)
        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599)
        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:857)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.time.DateTimeException: Invalid value for MonthOfYear (valid values 1 - 12): -83
        at java.base/java.time.temporal.ValueRange.checkValidValue(ValueRange.java:311)
        at java.base/java.time.temporal.ChronoField.checkValidValue(ChronoField.java:717)
        at java.base/java.time.LocalDate.of(LocalDate.java:269)
        at java.base/java.time.LocalDateTime.of(LocalDateTime.java:361)
        at io.debezium.connector.mysql.RowDeserializers.deserializeDatetime(RowDeserializers.java:378)
        at io.debezium.connector.mysql.RowDeserializers$WriteRowsDeserializer.deserializeDatetime(RowDeserializers.java:202)
        at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeCell(AbstractRowsEventDataDeserializer.java:181)
        at com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer.deserializeRow(AbstractRowsEventDataDeserializer.java:143)
        at com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer.deserializeRows(WriteRowsEventDataDeserializer.java:64)
        at com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer.deserialize(WriteRowsEventDataDeserializer.java:56)
        at com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer.deserialize(WriteRowsEventDataDeserializer.java:32)
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.deserializeEventData(EventDeserializer.java:303)
        at com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:232)
        at io.debezium.connector.mysql.BinlogReader$1.nextEvent(BinlogReader.java:252)
        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:952)
        ... 3 more

We noticed from the binlog file that, although the statement is correct, the fields have a strange date format.

#Q> INSERT INTO `some_table` (`tsCreated`, `tsUpdated`, `label_id`, `customer_id`, `category_id`, `product_id`, `te_id`, `lot_id`, `page`) VALUES ('2021-11-01 10:46:25.219851', '2021-11-01 10:46:25.219878'
, 1, NULL, NULL, 139153, '017c8322f53e00059e3b6c9d48bb000780021070004a0', NULL, 'auction')
#211101 11:46:25 server id 204203  end_log_pos 170739039 CRC32 0x055ccef1       Table_map: `database`.`some_table` mapped to number 97
# at 170739039
#211101 11:46:25 server id 204203  end_log_pos 170739157 CRC32 0xc82864ea       Write_rows: table id 97 flags: STMT_END_F
### INSERT INTO `database`.`some_table`
### SET
###   @1=219190348
###   @2='85489092-40-24 33:28:01'
###   @3='280044596-30-48 38:70:73'
###   @4=1
###   @5='017c8322f53e00059e3b6c9d48bb000780021070004a0'
###   @6='auction'
###   @7=139153
###   @8=NULL
###   @9=NULL
###   @10=NULL
# Number of rows: 1
# at 170739157
#211101 11:46:25 server id 204203  end_log_pos 170739188 CRC32 0x8dec9c7a       Xid = 2376154080
COMMIT/*!*/;

This binlog is already from a replica, but the master shows the same behavior, and looking into the database itself, the data is all correct.
So we believe the problem is not in MariaDB itself but on the client side.
The issue is quite similar to this one: https://jira.mariadb.org/browse/MDEV-20860
We tried upgrading Java and Debezium, and our latest attempt was to upgrade only this binlog library to version 0.25.4, but nothing helped.
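For context, here is a hedged sketch of the 5-byte DATETIME2 packing used in row events (sign bit, 17 bits of year*13+month, 5-bit day, 5-bit hour, 6-bit minute, 6-bit second; fractional seconds follow separately). Nonsense values like '85489092-40-24 33:28:01' are characteristic of reading these bit fields, or the datetime(6) fractional width, at the wrong offsets:

```java
public class Main {
    // Pack a datetime into the 40-bit DATETIME2 integer layout (sketch).
    static long pack(int y, int mo, int d, int h, int mi, int s) {
        long ym = y * 13L + mo;                       // year and month share 17 bits
        return (1L << 39)                             // sign bit (set for non-negative)
            | (ym << 22) | ((long) d << 17)
            | ((long) h << 12) | ((long) mi << 6) | s;
    }

    // Unpack by shifting and masking at the documented bit offsets.
    static String unpack(long v) {
        long ym = (v >> 22) & 0x1FFFF;
        return String.format("%04d-%02d-%02d %02d:%02d:%02d",
            ym / 13, ym % 13,
            (v >> 17) & 0x1F, (v >> 12) & 0x1F, (v >> 6) & 0x3F, v & 0x3F);
    }

    public static void main(String[] args) {
        long packed = pack(2021, 11, 1, 10, 46, 25);
        System.out.println(unpack(packed)); // 2021-11-01 10:46:25
    }
}
```

A decoder that shifts by even one bit too few or too many produces wildly wrong years and out-of-range months, which matches the @2/@3 values shown by mysqlbinlog above.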

Fix in "upstream" not "synced" to this repo

We recently switched over to using your fork rather than the original version, as a result of that I checked the difference between original/"upstream" and this fork, using: master...shyiko:master

It appears that the fix for shyiko#321 is not included in this repo, as it was committed after "time-of-fork". In particular, the following two commits could potentially be of use here too: shyiko/mysql-binlog-connector-java@0.20.1...c54b4c1

Cherry-picking the first commit, e59006e, worked without any hassle. Things got a bit hairy when trying to apply c54b4c1 as lines in the same file were touched later on.

I will try to put up a PR, but preemptively created this issue in case I fail :-)

Thanks for keeping the project alive!

eventHeaderDeserializer memory leak cause full gc

Background

The company's trading center uses sharded databases and tables: the MySQL database has 32 schemas, each with 32 tables, so one logical table corresponds to 32*32 = 1024 physical tables. We are currently synchronizing data to Kafka through flink-connector-mysql-cdc 2.1, and we found that after a job runs for about a week, full GCs become frequent.

Analyze

  1. Dumping the heap with jmap shows that tableMapEventByTableId consumes a lot of memory. Because there are two physical databases, this stream reads the binlogs of both at the same time, which is why the same memory problem appears twice below.
  2. tableMapEventByTableId is an instance of LRUCache. LRUCache extends LinkedHashMap and implements the eviction method, so in theory it should not grow past 10,000 entries.
public EventDeserializer(
            EventHeaderDeserializer eventHeaderDeserializer,
            EventDataDeserializer defaultEventDataDeserializer
    ) {
        this.eventHeaderDeserializer = eventHeaderDeserializer;
        this.defaultEventDataDeserializer = defaultEventDataDeserializer;
        this.eventDataDeserializers = new IdentityHashMap<EventType, EventDataDeserializer>();
        this.tableMapEventByTableId = new LRUCache<>(100, 0.75f, 10000);
        registerDefaultEventDataDeserializers();
        afterEventDataDeserializerSet(null);
    }
public class LRUCache<K,V> extends LinkedHashMap<K,V> {
    private int maxSize;

    public LRUCache(int initialCapacity, float loadFactor, int maxSize) {
        super(initialCapacity, loadFactor, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }
}

Could LinkedHashMap cause a memory leak?
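The eviction itself can be checked in isolation (same LRUCache shape as quoted above). The entry count stays capped, which would point at the size of the up-to-10,000 retained TableMapEventData values (column metadata for 1024 physical tables, per stream) rather than at a LinkedHashMap bug:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
    // Same shape as the LRUCache quoted above.
    static class LRUCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxSize;

        LRUCache(int initialCapacity, float loadFactor, int maxSize) {
            super(initialCapacity, loadFactor, true); // true = access order
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxSize;
        }
    }

    public static void main(String[] args) {
        LRUCache<Long, String> cache = new LRUCache<>(100, 0.75f, 10_000);
        for (long tableId = 0; tableId < 100_000; tableId++) {
            cache.put(tableId, "table-map-" + tableId); // eldest entries evicted
        }
        System.out.println(cache.size()); // capped at 10000
    }
}
```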

A document with screenshots of the MAT analysis:
https://pyie1bcd1x.feishu.cn/docs/doccnpn6mZ6rU1vrmAClIOnc8ec#

TableMapEventMetadata is always null

TableMapEventData data = event.getData();
TableMapEventMetadata tableMapEventMetadata = data.getEventMetadata(); // null here
List<String> columnNames = tableMapEventMetadata.getColumnNames();     // throws NPE

Question on compatibility

Hey @osheroff, Gunnar here from the Debezium team. I just learned about this fork of Shyiko's connector, which he recommends instead of his abandoned one. Are you planning to maintain this in the long run? There seem to be quite a few commits since Shyiko's version, is it still a drop-in replacement we could "just" use in Debezium?

Lastly, a more broadly-scoped question: would you perhaps be interested in making maintenance of this a joint effort, potentially within the debezium organization on GitHub?

All the Best,

--Gunnar

Using mariadb_slave_capability=1 breaks GTID events

With release 0.27.1 (this commit specifically).

With mariadb_slave_capability=1 the connector does not get MARIADB_GTID events at all. With mariadb_slave_capability=4 those events are there.

I am not aware of the reasons behind setting the capability mode to 1, but I would suggest at least making it configurable.

Please, see the attached test-case. It works in 0.27.0 and fails with next versions.

test-case.zip
