chriscomeau79 avatar chriscomeau79 commented on June 22, 2024

A few more notes - I think there was a deleted comment but I'll leave the reply:

The FixedString(26) version of the ULID still compresses reasonably well, since the extra bits are all zeroes. This optimization to get it down from 26 bytes to 16 is more about uncompressed memory usage and other benefits like fitting in 128 bits for memory alignment, as well as being able to go to/from UInt128 directly.

Put another way: it would be nice to be able to take any UInt128 and get a ULID string from it, which should be possible.

The ULID generation spec, with its 48-bit timestamp and 80 bits of randomness, happens to be a good way to generate well-behaved UInt128 equivalents which compress well. That's what I was trying to get at with the example here, showing how python-ulid can accept the max UInt128 and give this result. ClickHouse could do the same thing with something like UInt128ToULIDString and ULIDStringToUInt128.

Round trip examples:

select UInt128ToULIDString(340282366920938463463374607431768211455);
-- '7ZZZZZZZZZZZZZZZZZZZZZZZZZ'

select ULIDStringToUInt128('7ZZZZZZZZZZZZZZZZZZZZZZZZZ');
-- 340282366920938463463374607431768211455

select UInt128ToULIDString(0);
-- '00000000000000000000000000'

select ULIDStringToUInt128('00000000000000000000000000');
-- 0
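
For reference, the round trip above can be sketched in Python. This is only an illustration of the Crockford base32 mapping the ULID spec uses, not how ClickHouse would implement it; the function names `uint128_to_ulid_string` and `ulid_string_to_uint128` are hypothetical and just mirror the proposed SQL names.

```python
# Crockford base32 alphabet from the ULID spec (no I, L, O or U).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
DECODE = {ch: i for i, ch in enumerate(ALPHABET)}


def uint128_to_ulid_string(value: int) -> str:
    """Encode any 128-bit unsigned integer as a 26-character ULID string."""
    if not 0 <= value < (1 << 128):
        raise ValueError("value does not fit in 128 bits")
    chars = []
    for _ in range(26):
        chars.append(ALPHABET[value & 0x1F])  # lowest 5 bits -> one character
        value >>= 5
    return "".join(reversed(chars))  # most significant character first


def ulid_string_to_uint128(text: str) -> int:
    """Decode a 26-character ULID string back to a 128-bit unsigned integer."""
    if len(text) != 26:
        raise ValueError("a ULID string must be exactly 26 characters")
    value = 0
    for ch in text.upper():
        if ch not in DECODE:
            raise ValueError(f"invalid ULID character: {ch!r}")
        value = (value << 5) | DECODE[ch]
    # 26 characters hold 130 bits, so the first character may not exceed '7'.
    if value >= (1 << 128):
        raise ValueError("ULID string overflows 128 bits")
    return value


print(uint128_to_ulid_string((1 << 128) - 1))
print(ulid_string_to_uint128("00000000000000000000000000"))
```

Since 26 characters of 5 bits each cover 130 bits, the leading character only ever uses 3 bits, which is why the max UInt128 starts with `7` rather than `Z`.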

The process of generating ULIDs would still behave the same as in the spec, so we get those benefits with locality and compression. The internal representation would just be those 128 bits with the behaviour built in to display as a ULID string.

select generateULID();
-- '01HYSRYCB8G8DF3DT60C0K9GX1'

select generateULID();
-- '01HYSRYCB8PNC1T5DA7JRRR66V'
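
A rough Python sketch of what that internal 128-bit value looks like when generated per the spec layout (48-bit millisecond timestamp in the high bits, 80 random bits below). The helper names here are made up for illustration, and this skips the spec's optional monotonic handling for values generated within the same millisecond.

```python
import os
import time
from typing import Optional

# Crockford base32 alphabet from the ULID spec (no I, L, O or U).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def generate_ulid_uint128(timestamp_ms: Optional[int] = None) -> int:
    """Build a ULID as a plain 128-bit integer: 48-bit timestamp + 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp does not fit in 48 bits")
    randomness = int.from_bytes(os.urandom(10), "big")  # 10 bytes = 80 bits
    return (timestamp_ms << 80) | randomness


def to_ulid_string(value: int) -> str:
    """Display form: 26 Crockford base32 characters, most significant first."""
    chars = []
    for _ in range(26):
        chars.append(ALPHABET[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


a = generate_ulid_uint128()
b = generate_ulid_uint128()
# Values generated close together share the timestamp prefix of the string,
# which is where the locality and compression benefits come from.
print(to_ulid_string(a))
print(to_ulid_string(b))
```

Because the timestamp sits in the high bits, sorting the integers and sorting the strings both give time order.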

If it's useful, this could be generalized to do the same thing with UInt256 and strings that are twice as long (52 characters) but otherwise follow the ULID convention. It's a nice way to deal with such long numbers. Hypothetically the same style of generation would work too, with a 48-bit timestamp + 208 bits of randomness, but it's hard to think of a scenario where that would be needed. Could call that a ULID256.
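
To make the 256-bit idea concrete, here is a small Python sketch under the same assumptions as the ULID convention: Crockford base32, most significant character first, timestamp in the top 48 bits. "ULID256" and both function names are hypothetical; nothing like this exists in the ULID spec or in ClickHouse.

```python
import os
import time
from typing import Optional

# Crockford base32 alphabet from the ULID spec (no I, L, O or U).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid_style(value: int, bits: int) -> str:
    """Encode an unsigned integer of the given width as a ULID-style string."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"value does not fit in {bits} bits")
    length = (bits + 4) // 5  # 5 bits per character: 26 for 128, 52 for 256
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def generate_ulid256(timestamp_ms: Optional[int] = None) -> int:
    """Hypothetical ULID256: 48-bit timestamp followed by 208 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp does not fit in 48 bits")
    randomness = int.from_bytes(os.urandom(26), "big")  # 26 bytes = 208 bits
    return (timestamp_ms << 208) | randomness


print(encode_ulid_style((1 << 128) - 1, 128))
print(encode_ulid_style((1 << 256) - 1, 256))
print(encode_ulid_style(generate_ulid256(), 256))
```

With 52 characters covering 260 bits, the leading character of a 256-bit value only uses 1 bit, so the max UInt256 would start with `1`.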

from clickhouse.
