feikesteenbergen commented on June 14, 2024

The answer to this problem requires some detail, as there are quite a few things to unpack here.

Doesn't this mean Timescale can potentially create a corrupted database state (if used without backing up WAL files), as what I'm seeing in the first snippet above?

No.

There is no corrupted database state in the problem you describe. There is, however, the following problem:

  • The primary has advanced further than the replica can keep up with.

Without going into too much detail, WAL is a perpetual stream of changes generated by the primary.
The WAL Volume, however, is of limited size, so WAL needs to be removed at some point.

To allow replicas to follow the primary, a few things will prevent WAL from being deleted:

  • Replication slots: a replication slot basically points at a position in the WAL stream.
    If the replication slot is not advanced, the WAL is not removed.
  • wal_keep_segments: by setting this value higher, you force the primary to keep more WAL
    (from PostgreSQL 13 onwards this setting is called wal_keep_size).
  • max_wal_size: by choosing a higher max_wal_size, you make it less likely that WAL is removed quickly.
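
As a rough illustration of how the first two mechanisms interact, here is a toy model in Python. It is only a sketch: the function names and the segment numbers are made up for this example, it ignores max_wal_size and checkpoints entirely, and it is not how PostgreSQL implements retention internally.

```python
from typing import Optional


def oldest_segment_to_keep(
    current_segment: int,
    keep_segments: int,
    slot_restart_segment: Optional[int],
) -> int:
    """Return the oldest WAL segment number a toy primary would retain.

    Hypothetical helper: models only a wal_keep_segments-style setting
    and a single replication slot.
    """
    # Segments retained because of the wal_keep_segments-style setting.
    oldest = max(current_segment - keep_segments, 0)
    # A replication slot that lags further back pins even older segments.
    if slot_restart_segment is not None:
        oldest = min(oldest, slot_restart_segment)
    return oldest


def replica_can_resume(replica_needs_segment: int, oldest_kept: int) -> bool:
    """A replica can keep streaming only if the segment it needs still exists."""
    return replica_needs_segment >= oldest_kept


# Without a slot, a replica that fell far behind finds its segment gone.
no_slot = oldest_segment_to_keep(current_segment=288, keep_segments=8,
                                 slot_restart_segment=None)
print(no_slot, replica_can_resume(32, no_slot))      # 280 False

# With a slot still pointing at segment 32, that segment is retained.
with_slot = oldest_segment_to_keep(current_segment=288, keep_segments=8,
                                   slot_restart_segment=32)
print(with_slot, replica_can_resume(32, with_slot))  # 32 True
```

The flip side of the second case is that a slot which never advances keeps pinning WAL, which is how a WAL Volume of limited size can fill up.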

this means that by default, timescale will delete WAL file (due to the exit code 0), even with no backup provided.

Yes, without a backup method, we do not have any option other than to remove WAL files; otherwise the database
would have to stop once the WAL Volume is filled with WAL. I would therefore advise everyone to enable the backup.

or have the option to disable the use of WAL for recovery (if that's possible).

WAL is the foundation of replication in PostgreSQL, so it cannot be disabled.

If so, I think maybe we should enable backup by default

We cannot easily enable it by default. These Helm Charts allow you to run TimescaleDB on any Kubernetes cluster.
However, the backup requires S3-compatible object storage to be available to the user.

Oh apparently there's already basebackup as default :
Line 179 in 286b1fb

  • basebackup

This value describes the bootstrap method, not the backup method.

The bootstrap method of Patroni defines how a replica is created. By default, Patroni will use pg_basebackup
to create a replica if no backup method is available.

000000030000000100000020 has already been removed

I cannot fully understand your situation; however, this does tell me a few things:

  • You have had 2 failovers in your deployment (the first part of the filename is 00000003, so you're on timeline 3).
  • You have not had a lot of changes yet in your deployment (0000000100000020). These fields are hexadecimal, so with
    the default 16 MB segment size this is roughly segment 288, or about 4.5 GB of WAL.
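
The file name can also be decoded mechanically. The sketch below assumes the standard 24-character WAL file name layout (8 hex digits each for timeline, log number and segment within the log) and the default 16 MB segment size; `parse_wal_filename` is a name made up for this example, not a PostgreSQL or Patroni function.

```python
def parse_wal_filename(name: str, segment_size: int = 16 * 1024 * 1024) -> dict:
    """Split a WAL segment file name into its three hexadecimal fields.

    Assumes the default 16 MB segment size unless told otherwise.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL file name")
    timeline = int(name[0:8], 16)    # increases by one on every promotion
    log_id = int(name[8:16], 16)     # high part of the WAL position
    seg_id = int(name[16:24], 16)    # segment within that log, in hex
    # Each log covers 4 GiB, so it holds 256 segments of 16 MB.
    segments_per_log = 0x100000000 // segment_size
    segment_number = log_id * segments_per_log + seg_id
    return {
        "timeline": timeline,
        "log_id": log_id,
        "seg_id": seg_id,
        "segment_number": segment_number,
        "approx_wal_bytes": segment_number * segment_size,
    }


info = parse_wal_filename("000000030000000100000020")
print(info["timeline"])        # 3
print(info["segment_number"])  # 288
```

Note that the last field reads 20 in hexadecimal, which is 32 in decimal, and the middle field adds another 256 segments on top of that.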

I'm guessing that you've had 2 failovers in quick succession. If the requested WAL segment cannot be found and no backup
is available, the solution for the failing replica is to clean out its Data and WAL Volumes.

If a backup had been available, this WAL segment could have been fetched from the archive.

from helm-charts.
