
Comments (3)

dontstopbelieveing commented on June 2, 2024

Created this PR to update the documentation: #1388


timvaillancourt commented on June 2, 2024

After this error gh-ost went back to the migrating stage, supposedly to do rollbacks, which caused an availability issue and we had to kill the process. What is it rolling back? Since no cut-over was done, should it not simply abandon things and release locks?

@dontstopbelieveing could you elaborate on this point?

To my knowledge, gh-ost doesn't have any concept of "rolling something back", so I wonder if what you're seeing is a MySQL/InnoDB behaviour.
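If it reproduces, it may be worth checking on the MySQL side whether InnoDB is actually rolling a transaction back while gh-ost appears stuck. A minimal sketch of the check, assuming the mysql client can connect to the affected instance:

# List InnoDB transactions; trx_state shows ROLLING BACK if a rollback is in flight
mysql -e "SELECT trx_id, trx_state, trx_started, trx_rows_modified
          FROM information_schema.innodb_trx ORDER BY trx_started;"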


dontstopbelieveing commented on June 2, 2024

I'll add what we see in our test here. We land at the postpone cut-over stage:

Copy: 100000000/100000000 100.0%; Applied: 751753; Backlog: 0/1000; Time: 1h27m30s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232697:134650910; Lag: 0.09s, HeartbeatLag: 9.49s, State: postponing cut-over; ETA: due
[2024/03/09 11:31:04] [info] binlogsyncer.go:723 rotate to (mysql-bin-changelog.232698, 4)
[2024/03/09 11:31:04] [info] binlogsyncer.go:723 rotate to (mysql-bin-changelog.232698, 4)
2024-03-09 11:31:04 INFO rotate to next log from mysql-bin-changelog.232698:134667787 to mysql-bin-changelog.232698
2024-03-09 11:31:04 INFO rotate to next log from mysql-bin-changelog.232698:0 to mysql-bin-changelog.232698
Copy: 100000000/100000000 100.0%; Applied: 752224; Backlog: 0/1000; Time: 1h28m0s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232698:134396827; Lag: 0.09s, HeartbeatLag: 18.59s, State: postponing cut-over; ETA: due
[2024/03/09 11:31:24] [info] binlogsyncer.go:723 rotate to (mysql-bin-changelog.232699, 4)
[2024/03/09 11:31:24] [info] binlogsyncer.go:723 rotate to (mysql-bin-changelog.232699, 4)
2024-03-09 11:31:24 INFO rotate to next log from mysql-bin-changelog.232699:134397338 to mysql-bin-changelog.232699
2024-03-09 11:31:24 INFO rotate to next log from mysql-bin-changelog.232699:0 to mysql-bin-changelog.232699
2024-03-09 11:31:24 INFO rotate to next log from mysql-bin-changelog.232699:0 to mysql-bin-changelog.232699
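(For context, the postpone stage is driven by the --postpone-cut-over-flag-file option: gh-ost holds off cut-over for as long as that file exists. The path below is a placeholder for whatever we pointed the flag at.)

# gh-ost proceeds to cut-over once the postpone flag file is removed
rm /tmp/gh-ost.postpone.flag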

And then once we delete the cut-over flag:

2024-03-09 11:31:27 INFO Grabbing voluntary lock: gh-ost.2374.lock
2024-03-09 11:31:27 INFO Setting LOCK timeout as 6 seconds
2024-03-09 11:31:27 INFO Looking for magic cut-over table
2024-03-09 11:31:27 INFO Creating magic cut-over table `sbtest`.`_sbtest1_del`
2024-03-09 11:31:27 INFO Magic cut-over table created
2024-03-09 11:31:27 INFO Locking `sbtest`.`sbtest1`, `sbtest`.`_sbtest1_del`
2024-03-09 11:31:27 INFO Tables locked
2024-03-09 11:31:27 INFO Session locking original & magic tables is 2374
2024-03-09 11:31:27 INFO Writing changelog state: AllEventsUpToLockProcessed:1709983887171918805
2024-03-09 11:31:27 INFO Waiting for events up to lock
2024-03-09 11:31:30 ERROR Timeout while waiting for events up to lock
2024-03-09 11:31:30 ERROR 2024-03-09 11:31:30 ERROR Timeout while waiting for events up to lock
2024-03-09 11:31:30 INFO Looking for magic cut-over table
2024-03-09 11:31:30 INFO Will now proceed to drop magic table and unlock tables
2024-03-09 11:31:30 INFO Dropping magic cut-over table
2024-03-09 11:31:30 INFO Dropping magic cut-over table
2024-03-09 11:31:30 INFO Dropping table `sbtest`.`_sbtest1_del`

So far so good. At this point I would expect the metadata locks to be released. But they don't get released, and the log has these entries:

Copy: 100000000/100000000 100.0%; Applied: 752633; Backlog: 0/1000; Time: 1h28m10s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232699:134374965; Lag: 0.09s, HeartbeatLag: 7.19s, State: migrating; ETA: due
Copy: 100000000/100000000 100.0%; Applied: 752633; Backlog: 0/1000; Time: 1h28m15s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232699:134374965; Lag: 0.09s, HeartbeatLag: 12.19s, State: migrating; ETA: due
Copy: 100000000/100000000 100.0%; Applied: 752633; Backlog: 0/1000; Time: 1h28m20s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232699:134374965; Lag: 0.09s, HeartbeatLag: 17.19s, State: migrating; ETA: due
Copy: 100000000/100000000 100.0%; Applied: 752633; Backlog: 0/1000; Time: 1h28m25s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232699:134374965; Lag: 0.09s, HeartbeatLag: 22.19s, State: migrating; ETA: due
2024-03-09 11:31:46 INFO rotate to next log from mysql-bin-changelog.232700:134391842 to mysql-bin-changelog.232700
2024-03-09 11:31:46 INFO rotate to next log from mysql-bin-changelog.232700:0 to mysql-bin-changelog.232700
[2024/03/09 11:31:46] [info] binlogsyncer.go:723 rotate to (mysql-bin-changelog.232700, 4)
[2024/03/09 11:31:46] [info] binlogsyncer.go:723 rotate to (mysql-bin-changelog.232700, 4)
2024-03-09 11:31:47 INFO Intercepted changelog state AllEventsUpToLockProcessed
2024-03-09 11:31:47 INFO Handled changelog state AllEventsUpToLockProcessed
Copy: 100000000/100000000 100.0%; Applied: 752689; Backlog: 0/1000; Time: 1h28m30s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232700:135022773; Lag: 0.09s, HeartbeatLag: 5.79s, State: migrating; ETA: due
Copy: 100000000/100000000 100.0%; Applied: 752689; Backlog: 0/1000; Time: 1h28m35s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232700:135022773; Lag: 0.09s, HeartbeatLag: 10.79s, State: migrating; ETA: due
Copy: 100000000/100000000 100.0%; Applied: 752689; Backlog: 0/1000; Time: 1h28m40s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232700:135022773; Lag: 0.09s, HeartbeatLag: 15.79s, State: migrating; ETA: due
Copy: 100000000/100000000 100.0%; Applied: 752689; Backlog: 0/1000; Time: 1h28m45s(total), 1h8m1s(copy); streamer: mysql-bin-changelog.232700:135022773; Lag: 0.09s, HeartbeatLag: 20.79s, State: migrating; ETA: due

This continued until we manually killed the gh-ost process.
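While it sat in this state, the held locks should be visible on the MySQL side. A sketch of the check, assuming performance_schema's metadata-lock instrumentation is enabled (off by default on 5.7, on by default on 8.0):

# Metadata locks held or pending on the sbtest schema, and the owning threads
mysql -e "SELECT object_schema, object_name, lock_type, lock_status, owner_thread_id
          FROM performance_schema.metadata_locks
          WHERE object_schema = 'sbtest';"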

I am baffled by two issues:

  1. After the error, the metadata locks should have been released (and possibly gh-ost should exit), but this did not happen.
  2. I am confused about what the migrating stage afterwards is. Is it a retry (see the sketch below)? Even a retry should release the metadata locks and reacquire them, IMO.

The reason I said "rolling something back" is that the effect I see on MySQL is similar to what a long-running transaction doing a rollback would cause. This might not be a rollback, but gh-ost running something else.
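On the retry question (point 2 above): from what I can tell from the docs, a failed cut-over attempt is retried rather than aborting the migration, with the retry count governed by --default-retries and the lock wait bounded by --cut-over-lock-timeout-seconds (the "Setting LOCK timeout as 6 seconds" line above is twice that value). A hypothetical invocation showing just those knobs; connection details and the alter are placeholders, not our real command line:

gh-ost \
  --host=127.0.0.1 \
  --database=sbtest \
  --table=sbtest1 \
  --alter="ENGINE=InnoDB" \
  --postpone-cut-over-flag-file=/tmp/gh-ost.postpone.flag \
  --cut-over-lock-timeout-seconds=3 \
  --default-retries=60 \
  --execute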

