
Comments (7)

gouthamve commented on June 18, 2024

If we are going to do retention policies via tombstones, we need some kind of rolling info in the tombstone (a sketch follows the list). Something like:

  1. seriesA 1200-1600hr (timespan) Deleted
  2. seriesB anything older than 10 days Retention
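
A rough sketch of what such records could carry, in Go (all names here are hypothetical, not actual tsdb types):

```go
// Hypothetical "rolling" tombstone record: either an absolute
// deleted time-span (case 1) or a rolling retention rule (case 2).
type tombstoneKind uint8

const (
	kindDeleted   tombstoneKind = iota // explicit time-span delete
	kindRetention                      // rolling "older than d" rule
)

type tombstoneRecord struct {
	ref         uint64        // series reference (e.g. seriesA)
	kind        tombstoneKind
	mint, maxt  int64 // for kindDeleted: absolute range in ms
	olderThanMs int64 // for kindRetention: rolling window, e.g. 10 days in ms
}
```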

Now if we are going to specify retention via a config file, we might need to change every block on reload. We also need to periodically compact blocks that have reached full size.

This assumes we are not allowed to view metrics immediately after they expire; if that is not the case, then only the compactor needs the retention information and the metrics stay around until compaction.


gouthamve commented on June 18, 2024

For implementing deletions we need to support time-range deletions too; a single chunk could contain multiple deleted and valid ranges.
My plan to implement this (a sketch follows the list):

  1. Store a deleted postings list with [{mint, maxt}] (deleted ranges) in the index.
  2. Embed a new field, deletedRanges: [{mint, maxt}], in the ChunkMeta.
  3. When looking up the ChunkMeta, the index populates the meta with this field.
  4. The iterator over the ChunkMeta simply skips the deleted time-ranges.
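
A rough sketch of steps 2-4 (the deletedRanges field and the deleted helper are the proposed additions, not existing tsdb API):

```go
// trange is a [mint, maxt] time-range; deletedRanges is the proposed
// new field from step 2, populated by the index on lookup (step 3).
type trange struct{ mint, maxt int64 }

type ChunkMeta struct {
	Ref              uint64
	MinTime, MaxTime int64
	deletedRanges    []trange
}

// deleted reports whether timestamp t falls in a deleted range, so
// that the iterator can skip over it (step 4).
func (cm *ChunkMeta) deleted(t int64) bool {
	for _, r := range cm.deletedRanges {
		if r.mint <= t && t <= r.maxt {
			return true
		}
	}
	return false
}
```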

One optimization could be to store the fully deleted postings and expose a Deleted() Postings method on IndexReader that the query engine can intersect with to remove fully deleted series. But how to detect which series have been fully deleted in a persisted index is still to be determined.

Haven't looked into the compaction code-base yet, but it should be straightforward there.

Does this sound okay?


gouthamve commented on June 18, 2024

So I have been thinking a little more about this. Do we want to support deletion of arbitrary time-ranges? Or only deletion beyond a point in time, i.e., "Delete all metrics in the time-series older than t0"?

If it is the latter, implementation becomes a wee bit easier, but the approach will still be the one mentioned above.


gouthamve commented on June 18, 2024

We were thinking of using tombstones on headBlocks too. But InfluxDB has a similar storage scenario, and they remove the in-memory entries directly when doing deletes.

Ref: Last paragraph under https://github.com/influxdata/influxdb/blob/master/tsdb/engine/tsm1/DESIGN.md#data-flow

Haven't dug into the code yet, but I think they are removing the data rather than removing the entries in the index. If we can drop chunks and data from memSeries, then not doing tombstones for in-memory data might be better.

But all of this needs to be benchmarked and validated.


fabxc commented on June 18, 2024

Edit: actually had this tab open for way too long and didn't see your updates from yesterday when writing this. If we can drop data immediately, sure, that's great. But likely more expensive and error-prone than just using tombstones (postings lists are not update-friendly, neither are our compressed chunks). The data will be gone once it's compacted anyway. Deletes are rare.

Let's focus on deletions for now (as would be done by the user).
I think retention policies could simply be implemented on top by doing a delete+compact cycle in the foreground. They also don't have to be strict. Just running this cycle every 1h would be fine IMO.
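
A minimal sketch of that cycle, assuming hypothetical Delete and Compact methods (DB's real API is not specified here):

```go
import (
	"math"
	"time"
)

// deleterCompactor stands in for the tsdb handle; both methods are
// assumptions for this sketch, not the actual tsdb API.
type deleterCompactor interface {
	Delete(mint, maxt int64) error
	Compact() error
}

// retentionLoop tombstones everything older than the retention window,
// then compacts so the data is actually dropped.
func retentionLoop(db deleterCompactor, retention time.Duration) {
	for range time.Tick(time.Hour) {
		cutoffMs := time.Now().Add(-retention).UnixNano() / int64(time.Millisecond)
		db.Delete(math.MinInt64, cutoffMs) // tombstone all data in (-inf, cutoffMs]
		db.Compact()                       // rewrite blocks without the deleted samples
	}
}
```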

General (non-)requirements around deletions:

  • they should be visible as soon as the deletion call returns
  • expected to be very rare and in bulk (compared to append writes)
  • okay to be slow

Regarding your proposal:
We generally expect deletes to happen against persisted and in-memory blocks equally. The in-memory block could be updated by the index entries you describe; the persisted blocks cannot.

This would be equivalent to a re-compaction of the index, as it's not feasible to start randomly adding and moving bytes in the existing index file.
So by this approach a deletion request would have to synchronously write a new index file. That would be perfectly okay: deletions may be slow, are rare, and happen in bulk.
If it was a bulk delete though, this would not reduce the size much because the sample data is still around. The next full compaction of the block would then take care of removing the actual samples by dropping full chunks or dropping some samples from them. The latter could also take care of rewriting chunks so we don't end up with two-sample chunks; that's probably just an optimisation for later though.
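
For the "dropping full chunks" part, a sketch (reusing the ChunkMeta/trange shapes from the earlier sketch; the sample-level rewrite is omitted):

```go
// dropDeletedChunks keeps only chunks that are not entirely covered by
// a tombstoned range. Partially-deleted chunks would additionally need
// re-encoding without the deleted samples, which is left out here.
func dropDeletedChunks(chks []ChunkMeta) []ChunkMeta {
	var out []ChunkMeta
	for _, c := range chks {
		fully := false
		for _, r := range c.deletedRanges {
			if r.mint <= c.MinTime && c.MaxTime <= r.maxt {
				fully = true // one tombstone covers the whole chunk
				break
			}
		}
		if !fully {
			out = append(out, c)
		}
	}
	return out
}
```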

What this does do overall is introduce a fair bit of noise into our index format for things that should only be ephemeral until the next full compaction.
An alternative here is having tombstones separately tracked in a WAL, to which we append deletes. All tombstones are additionally in an in-memory data structure. The elements in there are then considered, as you described, when going over iterators.
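
One possible shape for that (names hypothetical): each delete appends a fixed-size record to the WAL and is mirrored in an in-memory structure that iterators consult.

```go
// tombstone is the WAL record: an 8-byte series ID plus two 8-byte
// timestamps, i.e. the 24 bytes per entry used in the math below.
type tombstone struct {
	ref        uint64
	mint, maxt int64
}

// tombstoneIndex is the in-memory mirror, keyed by series reference.
type tombstoneIndex struct {
	byRef map[uint64][]trange // sorted, non-overlapping deleted ranges
}

func newTombstoneIndex() *tombstoneIndex {
	return &tombstoneIndex{byRef: make(map[uint64][]trange)}
}

func (ti *tombstoneIndex) add(ts tombstone) {
	// addInterval keeps the per-series list sorted and merged; see the
	// rebuild sketch further down.
	ti.byRef[ts.ref] = addInterval(ti.byRef[ts.ref], trange{ts.mint, ts.maxt})
}
```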

At the next compaction the tombstones are fully applied by removing the affected data from the newly compacted block. Doing some quick math for the in-memory size of the tombstone tracking:

1e6 series x 24 bytes (series ID, start, end) ≈ 24 MB

That seems pretty reasonable to track 1 million deletes (but would multiply by affected blocks of course). That should be stored in some form of postings list as you described to be considered when querying.
What I did not quite understand was your need for having the entries in the postings list AND in the ChunkMeta directly.

Some care must be taken to correctly rebuild the sorted delete-postings list if several deletes are done one after another.
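
For instance, a hypothetical helper that rebuilds the sorted, non-overlapping per-series list as each new delete comes in:

```go
// addInterval merges a new deleted range n into a sorted,
// non-overlapping list of ranges and returns the rebuilt list.
func addInterval(ivs []trange, n trange) []trange {
	var out []trange
	inserted := false
	for _, iv := range ivs {
		switch {
		case iv.maxt < n.mint: // iv lies entirely before n: keep as-is
			out = append(out, iv)
		case n.maxt < iv.mint: // iv lies entirely after n
			if !inserted {
				out = append(out, n)
				inserted = true
			}
			out = append(out, iv)
		default: // overlap: widen n to swallow iv
			if iv.mint < n.mint {
				n.mint = iv.mint
			}
			if iv.maxt > n.maxt {
				n.maxt = iv.maxt
			}
		}
	}
	if !inserted {
		out = append(out, n)
	}
	return out
}
```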

Overall this would not add complexity to the core files. Memory footprint is largely negligible. It is a fair bit faster and does not require a full rewrite if we are just deleting 2KB worth of samples from a 5GB block.
We could dynamically decide whether the amount of tombstones justifies a full re-compaction or whether we just keep the few tombstones around.

Generally, I think I'd lean towards making it part of the IndexReader as you said. This means persisted blocks also get an IndexWriter to perform deletions, and their IndexReader implementation gets additional in-memory structures.
The logging of tombstones could just be added as another entry type to the existing WAL. This means that persisted blocks also get a WAL now... things are getting more complex, but there's no real way around that :)


gouthamve commented on June 18, 2024

Okay, so we are going to support deleting arbitrary time-ranges, not just "delete older than t0".

> What I did not quite understand was your need for having the entries in the postings list AND in the ChunkMeta directly.

We won't be modifying the ChunkMeta directly. When looking up the ChunkMeta, we would populate the deleted ranges. This is because a chunk can have partially-deleted data and the iterator on it needs to know that info.
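
Concretely, the chunk iterator could be wrapped roughly like this (the iterator interface is simplified; names are hypothetical, reusing trange from the earlier sketch):

```go
// sampleIterator is a simplified stand-in for the chunk iterator.
type sampleIterator interface {
	Next() bool
	At() (t int64, v float64)
}

// deletedIterator skips samples whose timestamps fall into the
// deleted ranges populated on the ChunkMeta.
type deletedIterator struct {
	it      sampleIterator
	dranges []trange // sorted, non-overlapping
}

func (d *deletedIterator) At() (int64, float64) { return d.it.At() }

func (d *deletedIterator) Next() bool {
Outer:
	for d.it.Next() {
		t, _ := d.it.At()
		for _, r := range d.dranges {
			if r.mint <= t && t <= r.maxt {
				continue Outer // tombstoned sample: skip it
			}
		}
		return true
	}
	return false
}
```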

+1 for a separate tombstone file; it is cleaner than modifying the index. When we load the index, we load the info from this file too. Though for persisted blocks it could be just a tombstone file rather than entries inside a WAL.
And for in-mem blocks, it would be just added to the WAL.
But that is again an implementation detail and can be decided upon later.


gouthamve commented on June 18, 2024

Closed via #82


