
idgen's Issues

Provide option to generate UUID string representation that will round-trip between .NET and Java/Javascript

First, very nice tool!

Second, it follows the UUID spec very well. But there is one issue: passing the generated UUID string to the System.Guid(string) constructor does not produce a Guid identical to the one built from the UUID's bytes. In other words, this is currently not true:

Console.WriteLine(
    new Guid("de9425a4-e8dd-510b-8e00-b6ac890c733a") == new Guid(
        new byte[] {
            0xde, 0x94, 0x25, 0xa4,
            0xe8, 0xdd,
            0x51, 0x0b,
            0x8e, 0x00, 0xb6, 0xac, 0x89, 0x0c, 0x73, 0x3a
        }
    )
);

// false

The System.Guid implementation stores Guids internally in the following format:

public struct Guid
{
    public int a;
    public short b;
    public short c;
    public byte d;
    public byte e;
    public byte f;
    public byte g;
    public byte h;
    public byte i;
    public byte j;
    public byte k;
}

In .NET, the a, b and c fields are handled in little-endian byte order, which is also the machine byte order on Intel-based machines. The Guid(byte[]) constructor reads the first four bytes of the array as a little-endian int, and the next two pairs as little-endian shorts, to fill the fields shown above. ToByteArray() undoes that conversion, so bytes passed in network byte order (i.e. big-endian) come back out in the same order. Oddly, though, ToString() simply prints the individual components as hexadecimal numbers, concatenated in the segment format defined by RFC4122, without converting a, b and c back to the order of the original bytes. For a Guid built from network-order bytes, the first three groups therefore display byte-swapped. Way to go MS for standards-based compliance and interoperability.
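A minimal sketch of that behaviour, assuming a C# 9+ project with top-level statements (the variable names are illustrative only and are not part of idgen):

```csharp
using System;

// The sixteen bytes of "de9425a4-e8dd-510b-8e00-b6ac890c733a",
// in the order they appear in that string (network byte order).
byte[] networkOrder =
{
    0xde, 0x94, 0x25, 0xa4,
    0xe8, 0xdd,
    0x51, 0x0b,
    0x8e, 0x00, 0xb6, 0xac, 0x89, 0x0c, 0x73, 0x3a
};

var fromBytes = new Guid(networkOrder);

// ToByteArray() returns the bytes exactly as they were passed in.
Console.WriteLine(BitConverter.ToString(fromBytes.ToByteArray()));
// DE-94-25-A4-E8-DD-51-0B-8E-00-B6-AC-89-0C-73-3A

// ToString() prints a, b and c as numbers, so the first three groups are byte-swapped.
Console.WriteLine(fromBytes.ToString());
// a42594de-dde8-0b51-8e00-b6ac890c733a

// Going the other way, parsing the RFC4122 string swaps the first three groups in the byte array.
var fromString = new Guid("de9425a4-e8dd-510b-8e00-b6ac890c733a");
Console.WriteLine(BitConverter.ToString(fromString.ToByteArray()));
// A4-25-94-DE-DD-E8-0B-51-8E-00-B6-AC-89-0C-73-3A
```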

I appreciate that this tool follows the UUID spec and produces output formatted like what you'd get from Guid.ToString. However, it would be nice to have an option that makes the tool output a string that can be passed directly to the System.Guid(string) constructor, with no further conversion, so that the resulting Guid's byte representation (as returned by ToByteArray()) is exactly the same as the UUID. In other words, it would be nice to be able to do this:

$g = [Guid]::new($(dotnet idgen v5 -le bojangles '11de2b26-984e-56b4-aa25-b3bd28ea5ac2'))
$areEqual = (
    "$($g.ToByteArray() | % { $_.ToString('x2') })" -replace ' '
) -eq (
    $(dotnet idgen v5 bojangles '11de2b26-984e-56b4-aa25-b3bd28ea5ac2') -replace '-'
)

$areEqual
# True

The reason this is important is interoperability between, say, Java (and JavaScript) code and .NET code.

So using your documentation's bojangles example to further demonstrate the issue:

var g1 = new Guid("11de2b26-984e-56b4-aa25-b3bd28ea5ac2");
// Now hash 'bojangles' and use the UUID v5 algorithm to create a new GUID
var namespaceBytes = g1.ToByteArray();
var bojanglesBytes = System.Text.Encoding.UTF8.GetBytes("bojangles");

// Create a new byte array to hold the new GUID bytes
var bytes = new byte[namespaceBytes.Length + bojanglesBytes.Length];

Array.Copy(namespaceBytes, bytes, namespaceBytes.Length);
Array.Copy(bojanglesBytes, 0, bytes, namespaceBytes.Length, bojanglesBytes.Length);

// Now SHA1 hash the resulting byte array
var sha1Hasher = System.Security.Cryptography.SHA1.Create();
var bojanglesUUIDHash = sha1Hasher.ComputeHash(bytes);

// We're creating a v5 UUID...
bojanglesUUIDHash[6] = (byte)((bojanglesUUIDHash[6] & 0x0F) | 0x50);

// And conform to RFC4122...
bojanglesUUIDHash[8] = (byte)((bojanglesUUIDHash[8] & 0x3F) | 0x80);

// Keep only the first 16 bytes of the 20-byte SHA1 hash
Array.Resize(ref bojanglesUUIDHash, 16);
var newV5Uuid = new Guid(bojanglesUUIDHash);

// To perform a proper comparison, the expected v5 UUID string is parsed into a Guid and
// the bytes of its first 3 components are reversed, so that the byte array passed back
// to the Guid(byte[]) constructor is in the same (big-endian) order as the string:
var expectedResultBytes = new Guid("de9425a4-e8dd-510b-8e00-b6ac890c733a").ToByteArray();
Array.Reverse(expectedResultBytes, 0, 4);
Array.Reverse(expectedResultBytes, 4, 2);
Array.Reverse(expectedResultBytes, 6, 2);
var expectedResult = new Guid(expectedResultBytes);

// And now compare the results
Console.WriteLine(newV5Uuid.Equals(expectedResult));
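// Outputs 'False', because the namespace bytes were hashed in the order returned by
// ToByteArray() rather than in the order of the namespace string.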

To get an equivalent GUID using the same namespace GUID and 'bojangles' text, you could do either string manipulation (to convert the string to a byte array) or byte array manipulation on the namespace UUID, so that the bytes of the first 3 components are in the same order as in the string (big-endian) before hashing. I'll do byte array manipulation since that's easier for me to do here:

var g1 = new Guid("11de2b26-984e-56b4-aa25-b3bd28ea5ac2");
var namespaceBytes = g1.ToByteArray();
// Reverse the bytes of the a, b, and c components so they are in the same (big-endian) order as the string:
Array.Reverse(namespaceBytes, 0, 4);
Array.Reverse(namespaceBytes, 4, 2);
Array.Reverse(namespaceBytes, 6, 2);

var bojanglesBytes = System.Text.Encoding.UTF8.GetBytes("bojangles");

var bytes = new byte[namespaceBytes.Length + bojanglesBytes.Length];

Array.Copy(namespaceBytes, bytes, namespaceBytes.Length);
Array.Copy(bojanglesBytes, 0, bytes, namespaceBytes.Length, bojanglesBytes.Length);

var sha1Hasher = System.Security.Cryptography.SHA1.Create();
var bojanglesUUIDHash = sha1Hasher.ComputeHash(bytes);

// Make sure this is a RFC4122 compliant v5 UUID...
bojanglesUUIDHash[6] = (byte)((bojanglesUUIDHash[6] & 0x0F) | 0x50);
bojanglesUUIDHash[8] = (byte)((bojanglesUUIDHash[8] & 0x3F) | 0x80);

// Keep only the first 16 bytes of the 20-byte SHA1 hash
Array.Resize(ref bojanglesUUIDHash, 16);
var newV5Uuid = new Guid(bojanglesUUIDHash);

// Again, to perform a proper comparison, the expected v5 UUID needs the bytes of its first
// 3 components reordered to match the (big-endian) order of the string
var expectedResultBytes = new Guid("de9425a4-e8dd-510b-8e00-b6ac890c733a").ToByteArray();
Array.Reverse(expectedResultBytes, 0, 4);
Array.Reverse(expectedResultBytes, 4, 2);
Array.Reverse(expectedResultBytes, 6, 2);
var expectedResult = new Guid(expectedResultBytes);

Console.WriteLine(newV5Uuid.Equals(expectedResult));
// Should output 'True'

Now, it's true that if you call ToString on the newV5Uuid Guid, the output won't match the RFC4122-compliant UUID string, as explained at the beginning. Stringifying a Guid does not put the a, b and c components back into the byte order of the original array:

Console.WriteLine(newV5Uuid.ToString() == "de9425a4-e8dd-510b-8e00-b6ac890c733a");
// false

But, if we take the raw bytes in newV5Uuid, we'll see that the byte order is exactly the same as in the RFC4122-compliant UUID string representation:

var sb = new System.Text.StringBuilder();
var rawBytes = newV5Uuid.ToByteArray();
for (int i = 0; i < rawBytes.Length; ++i)
{
    sb.Append(rawBytes[i].ToString("x2"));
    // Insert the separators after bytes 4, 6, 8 and 10 (8-4-4-4-12 layout)
    if (i == 3 || i == 5 || i == 7 || i == 9)
    {
        sb.Append("-");
    }
}

Console.WriteLine(sb.ToString() == "de9425a4-e8dd-510b-8e00-b6ac890c733a");
// Should output 'True'

The above proves that the two unique identifiers are in fact identical, even if their string representations are different. Admittedly, what System.Guid really needs is an extension method, .ToRfc4122String().
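As a rough sketch of what such an extension method could look like (GuidRfc4122Extensions, ToRfc4122String and FromRfc4122String are hypothetical names; none of them exist in System.Guid or in idgen), assuming a C# 9+ project with top-level statements:

```csharp
using System;

// Usage: parse an RFC4122 string so the Guid's bytes match the string, then format it back.
var parsed = GuidRfc4122Extensions.FromRfc4122String("de9425a4-e8dd-510b-8e00-b6ac890c733a");
Console.WriteLine(parsed.ToString());        // a42594de-dde8-0b51-8e00-b6ac890c733a
Console.WriteLine(parsed.ToRfc4122String()); // de9425a4-e8dd-510b-8e00-b6ac890c733a

// Hypothetical helpers, not part of the .NET base class library.
public static class GuidRfc4122Extensions
{
    // Reverses the bytes of the first three fields (4 + 2 + 2 bytes) in place.
    private static void SwapFieldBytes(byte[] bytes)
    {
        Array.Reverse(bytes, 0, 4);
        Array.Reverse(bytes, 4, 2);
        Array.Reverse(bytes, 6, 2);
    }

    // Returns an 8-4-4-4-12 hex string whose digits follow the order of ToByteArray().
    public static string ToRfc4122String(this Guid guid)
    {
        byte[] bytes = guid.ToByteArray();
        SwapFieldBytes(bytes);
        return new Guid(bytes).ToString("D");
    }

    // Parses an RFC4122 string so that ToByteArray() returns the bytes in string order.
    public static Guid FromRfc4122String(string text)
    {
        byte[] bytes = Guid.Parse(text).ToByteArray();
        SwapFieldBytes(bytes);
        return new Guid(bytes);
    }
}
```

Both helpers reuse the same three Array.Reverse calls used in the comparisons above, so they are each other's inverse.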

idgen not found in ZSH install

When installing the tool in zsh, idgen does not seem to work.

zsh: command not found: idgen

It does, however, when called from bash.
