getty-zig / getty

A (de)serialization framework for Zig

Home Page: https://getty.so

License: MIT License

Zig 100.00%

zig getty serialization ziglang

getty's People

Contributors

demizer veloscillator ion232 codingonion hanaasagi namek polykernel 0xvbetsun lunagl acodervic jamestiotio cloudef sea-grass

getty's Issues

Replace `expect(eql())` with `testing.expectEqualSlices()`
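
For context, here is a minimal sketch of the change this issue title asks for. It uses only std.testing and std.mem; the sample data is invented for illustration and is not taken from Getty's test suite.

```zig
const std = @import("std");
const testing = std.testing;

test "compare slices" {
    const expected = [_]u8{ 1, 2, 3 };
    const actual = [_]u8{ 1, 2, 3 };

    // Before: a failure only says that the expectation was false.
    try testing.expect(std.mem.eql(u8, &expected, &actual));

    // After: a failure also reports where the two slices differ.
    try testing.expectEqualSlices(u8, &expected, &actual);
}
```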

Avoid unnecessary allocations during deserialization

Problem

Currently, allocation during deserialization works like this:

Every pointer value within the final value produced by a visitor will be allocated by the visitor, except for struct field names.

For example:

If i32 is the type being deserialized into, neither the visitor nor the deserializer needs to allocate anything.
If *i32 is the type being deserialized into, the visitor is responsible for allocating the pointer. The deserializer can just pass an integer to the visitor.
If []u8 is the type being deserialized into, the visitor is responsible for allocating the slice. In the case where the deserializer calls visitString, the deserializer can pass in either a string on the heap or a string on the stack. If the passed-in string is on the heap, the deserializer should always deallocate it after its call to visitString.
If struct { x: i32 } is the type being deserialized into, neither the visitor nor the deserializer needs to allocate anything. The field name is not a part of the final struct value (it's only used to compare against field names), so it doesn't require allocation.

Generally speaking, this model works well. However, it makes deserialization for certain data structures inefficient. For example, std.BufMap is a wrapper around std.StringHashMap that copies keys and values into the map instead of just moving them in. Under the current allocation model, deserializers would most likely allocate the keys and values (since that's what they generally do for maps), but then the visitor would duplicate those keys and values, resulting in a second allocation.
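
The []u8 case can be sketched as follows. This is an illustration under assumptions, not Getty's code: visitString and deserializeString here are hypothetical free functions standing in for a visitor method and a deserializer. The point is that the same string ends up allocated twice, once by the deserializer and once by the visitor.

```zig
const std = @import("std");

// Hypothetical stand-in for a visitor's visitString: the visitor owns the
// allocation of the final value, so it copies whatever it is handed.
fn visitString(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
    return allocator.dupe(u8, input);
}

// Hypothetical stand-in for a deserializer that builds its string on the heap.
fn deserializeString(allocator: std.mem.Allocator) ![]u8 {
    // First allocation: the deserializer's own scratch copy.
    const scratch = try allocator.dupe(u8, "hello");
    // The deserializer always frees its copy once the visitor has returned.
    defer allocator.free(scratch);

    // Second allocation happens inside the visitor.
    return visitString(allocator, scratch);
}
```

The caller owns the returned slice and frees it with the same allocator.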

Proposal

Should there be a way to let deserializers know when they should or shouldn't allocate the values they pass to a visitor? Or should the default visitors follow the aforementioned allocation model even if it's less efficient, and let users/deserializers handle edge cases?

Alternatives

In the std.BufMap case, a deserializer right now could simply provide its own deserialization block or use unique logic specifically for std.BufMap. But that's clearly not scalable, especially as Getty supports more and more types.

Run/test examples in CI

Pass serializer into aggregate serialization interface methods

Problem

The proposal in #55 got me thinking.

In Getty JSON, the implementation for the aggregate serialization interfaces (e.g., getty.ser.Seq, getty.ser.Map) has a field referencing the serializer passed to serializeSeq, serializeStruct, etc. I believe the field is used in every single one of the aggregate interface methods (no surprise there; most of them are literally called serializeX).

But having to carry this field around in the implementation isn't ideal. Any SB that calls serializeSeq already has a reference to the same serializer, so there's no need to store it in the implementation. Also, it makes more sense to have the serializer as a parameter of the methods doing the serialization than on the impl itself.

Proposal

So, what if we pass in the serializer to the methods like serializeElement, serializeKey, and so on? The SB calling serializeStruct would always have a reference to the relevant serializer so it'd be easy to pass it in for them.

This would simplify implementations of the aggregate interfaces since they'd no longer need to keep a redundant serializer around.

Alternatives

No response

Additional Context

No response

support for integer (non-string) keys in maps

I couldn't decide whether this should be 1 ticket or 2. But my first impression was that serialization generally works and then it turned out I can't move forward because the deserialization doesn't. The latter even requires more than just implementing custom blocks, as mentioned in #37.

Serialization

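const std = @import("std");
const json = @import("json");

var gpa = std.heap.GeneralPurposeAllocator(.{}){};

pub fn main() void {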
    const Test = struct {
        some_map: std.AutoHashMap(c_int, c_int),
    };
    var obj: Test = .{
        .some_map = std.AutoHashMap(c_int, c_int).init(std.heap.page_allocator),
    };
    _ = obj.some_map.put(3, 123) catch unreachable;
    var serialized = json.toPrettySlice(gpa.allocator(), obj) catch unreachable;
    std.debug.print("{s}", .{serialized});
}

This prints:

{
  "some_map": {
    3: 123
  }
}

However, the key 3 is invalid because JSON format only allows key names to be strings. The result is supposed to be:

{
  "some_map": {
    "3": 123
  }
}

Deserialization

However, for deserialization of the same struct, like this:

var deserialized = json.fromSlice(std.heap.page_allocator, @TypeOf(obj), "{\"some_map\": {\"3\": 123}}") catch unreachable;
std.debug.print("{any}", .{deserialized});

we get an error:

W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:75:21: error: concept `StringKey` was not satisfied (expected key type to be `[]const u8`)
            else => @compileError(base ++ extra),
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:48:57: note: called from here
                for (results) |result| if (!result) fail(concept, msg);
                                                    ~~~~^~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\json-getty-zig-github.com-5753d08c\pkg\src\de\deserializer.zig:326:91: note: called from here
            comptime concepts.Concept("StringKey", "expected key type to be `[]const u8`")(.{
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

It is because MapAccess.nextKeySeed has this restriction:

comptime concepts.Concept("StringKey", "expected key type to be `[]const u8`")(.{
    concepts.traits.isSame(@TypeOf(seed).Value, []const u8),
});

As I mentioned in #37 I can't get through this error by implementing a custom deserialization block.
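
To illustrate the round trip being asked for, here is a minimal sketch that uses only std.fmt. The helper names writeIntKey and parseIntKey are hypothetical and are not part of Getty or Getty JSON; they only show how an integer key could be written as a JSON string and read back.

```zig
const std = @import("std");

// Serialization side: render an integer key as a quoted JSON string.
fn writeIntKey(buf: []u8, key: c_int) ![]const u8 {
    return std.fmt.bufPrint(buf, "\"{d}\"", .{key});
}

// Deserialization side: recover the integer from the unquoted key text.
fn parseIntKey(text: []const u8) !c_int {
    return std.fmt.parseInt(c_int, text, 10);
}
```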

Add tests for `getty.deserialize`

The visitors aren't getting referenced by the tests which is a bit annoying.

What is a deserialized value?

A deserialized value is one that is produced by a deserializer. One important property of a deserialized value is that it must be independent of any deserializer or access type. This means, for instance, deserialized pointers must be allocated¹.

In other words, if the nextKeySeed function of a MapAccess returns a pointer which points into a JSON token stream maintained by a deserializer, the returned pointer is not a deserialized value since the value's lifetime depends on the lifetime of a deserializer. The correct implementation would be for nextKeySeed to return an allocated string, and for visitors that call the function to assume the returned value is allocated since it is a pointer.

¹ Deserialized pointers could also reference global constants but Getty (specifically, getty.free) assumes that all pointers are allocated since that's a more common/reasonable way to do things.

Do not assume deserialized `struct` keys are allocated

Currently, Getty assumes that all deserialized pointers are allocated. Initially, I felt that this assumption was reasonable and consistent, and that it kept Deserializer and Visitor implementations simple.

The issue, however, is that not all deserialized pointers are part of the value produced by a Visitor. For example, during struct deserialization, deserialized keys (i.e., slices) are used only to check against a struct's field name. The key isn't a part of the final struct value at all. In such cases, allocating the key is not only pointless, it's terrible for performance.

The current workaround is to provide, in the deserializer implementation, a custom deserialization block for struct values. But that's a lot of work that will have to be repeated across libraries and it doesn't change the fact that the default behavior for struct deserialization is slow.

So, I propose that Getty should assume that all deserialized pointers are allocated except during struct key deserialization (struct value deserialization will still assume that pointers are allocated). This will make Deserializer implementations better and faster, and Visitor impls can still be kept simple.
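
A minimal sketch of why a struct key does not need its own allocation: the key is only compared against field names and then discarded. The function isFieldName and the Point type are hypothetical and are not part of Getty; the key may be a slice that still points into the deserializer's input.

```zig
const std = @import("std");

const Point = struct { x: i32, y: i32 };

// Returns true if `key` names a field of T. `key` is only read, never
// stored, so a borrowed slice is sufficient.
fn isFieldName(comptime T: type, key: []const u8) bool {
    inline for (std.meta.fields(T)) |field| {
        if (std.mem.eql(u8, field.name, key)) return true;
    }
    return false;
}
```

For example, with the input {"x":1}, the slice covering x can be passed to isFieldName(Point, key) directly, without copying it first.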

Make the `Attributes` type contain only relevant fields

Statement

The type returned by Attributes contains all fields/variants of the passed-in type.

There are a few problems with this:

If a user specifies attributes for only a few of those fields/variants, the type becomes quite wasteful in terms of storage.
In the serialize function, when serializing a struct or enum, the field serialization loop would be unrolled for every single type in the struct/enum being serialized, which may be quite large. The compiler will prune irrelevant branches but that's still compute that can be avoided.

Proposal

The attribute map type should consist of just the fields specified in attributes, as long as they are all valid.

Implementation

The Attributes function's signature should be changed as follows:

-pub fn Attributes(comptime T: type, attributes: _Attributes(T)) type
+pub fn Attributes(comptime T: type, attributes: anytype) type

Then, Attributes calls another function which takes both of Attributes's parameters as input. This function then goes over each field in attributes and checks whether T has a corresponding field or whether the field is named after T.

fn _Attributes(comptime T: type, attributes: anytype) type {
    if (@typeInfo(@TypeOf(attributes)) != .Struct) {
        @compileError("expected attribute map, found " ++ @typeName(@TypeOf(attributes)));
    }

    // Set up attribute map type
    // ...

    inline for (std.meta.fields(@TypeOf(attributes))) |field| {
        if (comptime std.meta.trait.hasField(field.name)(T) or std.mem.eql(u8, field.name, @typeName(T))) {
            // Add appropriate inner attribute struct to our attribute map type
        } else {
            // Reject attribute names that match neither a field of T nor T itself
            @compileError("invalid field: " ++ field.name);
        }
    }

    // Return attribute map type
    // ...
}

Figure out how visitors return errors

The error set of the deserializer is passed into the visitors, but how do the visitors know which error within that set they can return?

Support deserialization into arrays/tuples from slices.

Add deserialization support for extern unions

Should `std.ArrayList`s be serialized as strings?

Type-defined SBTs do not use blocks for inner types

Description

Consider the following, where we have a Point type that contains an ST specifying how to serialize Point and i32 values.

const Point = struct {
    x: i32,
    y: i32,

    pub const @"getty.sbt" = .{ point_sbt, i32_sbt };
};

const point_sbt = struct {
    pub fn is(comptime T: type) bool {
        return T == Point;
    }

    pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
        var s = try serializer.serializeSeq(2);
        const seq = s.seq();

        inline for (std.meta.fields(Point)) |field| {
            try seq.serializeElement(@field(value, field.name));
        }

        return try seq.end();
    }
};

const i32_sbt = struct {
    pub fn is(comptime T: type) bool {
        return T == i32;
    }

    pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
        return try serializer.serializeBool(value > 0);
    }
};

Getty correctly uses point_sbt to serialize Point values. However, it doesn't use i32_sbt to serialize the fields of Point values, resulting in the following incorrect serialization (with i32_sbt applied to both fields, the output would be [false,true]):

$ zig build run
[0,1]

How to Reproduce the Bug

const std = @import("std");
const getty = @import("getty");
const json = @import("json");

const allocator = std.heap.page_allocator;

const Point = struct {
    x: i32,
    y: i32,

    pub const @"getty.sbt" = .{ point_sbt, i32_sbt };
};

const point_sbt = struct {
    pub fn is(comptime T: type) bool {
        return T == Point;
    }

    pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
        var s = try serializer.serializeSeq(2);
        const seq = s.seq();

        inline for (std.meta.fields(Point)) |field| {
            try seq.serializeElement(@field(value, field.name));
        }

        return try seq.end();
    }
};

const i32_sbt = struct {
    pub fn is(comptime T: type) bool {
        return T == i32;
    }

    pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
        return try serializer.serializeBool(value > 0);
    }
};

pub fn main() !void {
    const point = Point{ .x = 0, .y = 1 };
    const s = try json.toSliceWith(allocator, point, .{});
    defer allocator.free(s);

    std.debug.print("{s}\n", .{s});
}

Additional Context

No response

Remove type-defined tuples

Problem

Currently, type-defined (de)serialization tuples can be created.

However, tuples are for grouping blocks that apply to different types, but type-defined serialization logic can only ever apply to the type that defined it, meaning tuples are useless.

Proposal

Remove type-defined tuples!

Alternatives

No response

Additional Context

No response

Free up resources upon deserialization failure for other compound types

`std.net.Address.resolveIP` doesn't work on Windows

Description

std.net.Address can't be (de)serialized atm on Windows due to an issue in resolveIP (I think). Here's a build that failed due to this issue.

How to Reproduce the Bug

In the net_address.zig block files, remove the enclosing if (builtin.os.tag != .windows) statements from the tests and then run the tests on Windows.

Additional Context

No response

Complete the data model

Rename `getty.de.Map` to `getty.de.MapAccess`

The interface function for getty.de.Map should also be renamed from map to mapAccess.

Investigate using arena allocators for deserialization

Problem

This is how struct visitors deallocate memory when a deserialization error occurs:

errdefer {
    if (allocator) |alloc| {
        inline for (fields) |field, i| {
            if (!field.is_comptime and seen[i]) {
                de.free(alloc, @field(structure, field.name));
            }
        }
    }
}

And this is how the array visitor does deallocation:

errdefer {
    if (allocator) |alloc| {
        if (array.len > 0) {
            var i: usize = 0;

            while (i < seen) : (i += 1) {
                de.free(alloc, array[i]);
            }
        }
    }
}

Everything is very manual and annoying to write. And even for user-defined types like std.ArrayList, we have to defer to getty.de.free, which itself has to manually walk through the list and free each child element before freeing the actual list.

Proposal

Wrap the allocator passed to Getty in an arena allocator. Visitors can then allocate whatever they want, and every allocation is released with a single call to the arena's deinit.
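
A minimal sketch of the idea, assuming nothing about Getty's internals: buildNames is a hypothetical visitor-like function that allocates several values and can fail partway through, and the arena is created by the caller.

```zig
const std = @import("std");

// Hypothetical visitor-like function: allocates several values and may fail
// partway through, with no per-value errdefer cleanup.
fn buildNames(allocator: std.mem.Allocator, fail: bool) ![]const []const u8 {
    const names = try allocator.alloc([]const u8, 2);
    names[0] = try allocator.dupe(u8, "x");
    if (fail) return error.Syntax;
    names[1] = try allocator.dupe(u8, "y");
    return names;
}

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    // One call releases everything allocated through the arena, whether
    // buildNames succeeded or failed.
    defer arena.deinit();

    const names = try buildNames(arena.allocator(), false);
    std.debug.print("{s} {s}\n", .{ names[0], names[1] });
}
```

One trade-off of this sketch is that the successful result also lives in the arena, so the arena has to stay alive for as long as the deserialized value is in use.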

Alternatives

No response

Additional Context

No response

add `deserializeAny` to the `Deserializer` interface

If you want to deserialize into type T from either a JSON map or a JSON list, you can't really do that right now since you have to specify deserializeMap or deserializeSequence in your deserialize function.

Modify concepts to check required functions as declarations instead

I forgot that using function checks instead of decl checks raises compile errors for things like serializeMap and serializeVoid if their method implementations are left undefined.
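
As an illustration of a declaration check, here is a minimal sketch built on the @hasDecl builtin. Impl and hasRequiredDecls are hypothetical and do not reproduce Getty's actual concept code; the sketch only shows that a check on the name alone accepts a method left as undefined.

```zig
const std = @import("std");

// Hypothetical implementation type with one method left undefined.
const Impl = struct {
    pub const serializeBool = undefined;

    pub fn serializeVoid() void {}
};

// Declaration check: only asks whether each name exists, so an `undefined`
// placeholder passes.
fn hasRequiredDecls(comptime T: type) bool {
    return @hasDecl(T, "serializeBool") and @hasDecl(T, "serializeVoid");
}

comptime {
    if (!hasRequiredDecls(Impl)) @compileError("missing required declaration");
}
```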

Deserialization of optional struct field with a default value

Description

Zig 0.10.0-master (downloaded yesterday)

The serialization test passes:

test "struct with nullables" {
    const T2 = struct { a: ?u32, b: ?u32, c: ?u32 = null, d: ?u32 = 5 };
    try t(T2{ .a = 1, .b = null }, &[_]Token{
        .{ .Struct = .{ .name = @typeName(T2), .len = 4 } },
        .{ .String = "a" },
        .{ .Some = {} },
        .{ .U32 = 1 },
        .{ .String = "b" },
        .{ .Null = {} },
        .{ .String = "c" },
        .{ .Null = {} },
        .{ .String = "d" },
        .{ .Some = {} },
        .{ .U32 = 5 },
        .{ .StructEnd = {} },
    });
}

but the same code fails for deserialization. In fact, just this part is already broken:

    const T1 = struct { a: ?u32 = 4 };
    try t(T1{ .a = 1 }, &[_]Token{
        .{ .Struct = .{ .name = @typeName(T1), .len = 1 } },
        .{ .String = "a" },
        .{ .Some = {} },
        .{ .U32 = 1 },
        .{ .StructEnd = {} },
    });

The error:

W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\impls\visitor\struct.zig:72:61: error: expected type '?u32', found '*const anyopaque'
                            @field(structure, field.name) = default;
                                                            ^~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\lib.zig:41:9: error: concept `getty.Deserializer` was not satisfied: missing `context` field
        @compileError("concept `" ++ concept ++ "` was not satisfied: " ++ msg);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\concepts\deserializer.zig:10:25: note: called from here
            concepts.err(concept, "missing `context` field");
            ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\interfaces\visitor.zig:137:47: note: called from here
                concepts.@"getty.Deserializer"(@TypeOf(deserializer));
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

I went on patching the src/de/impls/visitor/struct.zig:72:

@field(structure, field.name) = default;

⬇

@field(structure, field.name) = @as(field.field_type, default);

but then I get another error that I don't know how to fix. It also feels like a compiler bug?

All 21 tests passed.
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\impls\visitor\struct.zig:35:28: error: expected type '?u32', found '*const anyopaque'
                    inline for (fields) |field, i| {
                    ~~~~~~~^~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\lib.zig:41:9: error: concept `getty.Deserializer` was not satisfied: missing `context` field
        @compileError("concept `" ++ concept ++ "` was not satisfied: " ++ msg);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\concepts\deserializer.zig:10:25: note: called from here
            concepts.err(concept, "missing `context` field");
            ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\interfaces\visitor.zig:137:47: note: called from here
                concepts.@"getty.Deserializer"(@TypeOf(deserializer));
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~

The mentioned de/impls/visitor/struct.zig:

errdefer {
    if (allocator) |alloc| {
        inline for (fields) |field, i| {     // <--- line 35 from the error
            if (!field.is_comptime and seen[i]) {
                getty.de.free(alloc, @field(structure, field.name));
            }
        }
    }
}

How to Reproduce the Bug

Add a deserialization test to src/tests/ser/tests.zig and run it with zig build test in the main folder:

test "struct with nullables" {
    const T1 = struct { a: ?u32 = 4 };
    try t(T1{ .a = 1 }, &[_]Token{
        .{ .Struct = .{ .name = @typeName(T1), .len = 1 } },
        .{ .String = "a" },
        .{ .Some = {} },
        .{ .U32 = 1 },
        .{ .StructEnd = {} },
    });

    const T2 = struct { a: ?u32, b: ?u32, c: ?u32 = null, d: ?u32 = 5 };
    try t(T2{ .a = 1, .b = null }, &[_]Token{
        .{ .Struct = .{ .name = @typeName(T2), .len = 4 } },
        .{ .String = "a" },
        .{ .Some = {} },
        .{ .U32 = 1 },
        .{ .String = "b" },
        .{ .Null = {} },
        .{ .String = "c" },
        .{ .Null = {} },
        .{ .String = "d" },
        .{ .Some = {} },
        .{ .U32 = 5 },
        .{ .StructEnd = {} },
    });
}

Additional Context

No response

Support deserialization into strings from chars

Add deserialization support for packed unions

LLVM error when calling undefined serializer method implementation

Description

If a Getty serializer has a method that is unimplemented (i.e., undefined) and it is called, the following LLVM error is raised:

LLVM Emit Object... LLVM ERROR: Cannot select: 0x7ff488985fb8: ch,glue = X86ISD::CALL 0x7ff48a18d208, undef:ch, Register:i64 $rdi, Register:i32 $esi, RegisterMask:Untyped, 0x7ff48a18d208:1, ser/interfaces/serializer.zig:239:48
  0x7ff488985ba8: i64 = Register $rdi
  0x7ff4882d7db0: i32 = Register $esi
  0x7ff488985800: Untyped = RegisterMask
  0x7ff48a18d208: ch,glue = CopyToReg 0x7ff488985868, Register:i32 $esi, 0x7ff488986498, 0x7ff488985868:1, ser/interfaces/serializer.zig:239:48
    0x7ff4882d7db0: i32 = Register $esi
    0x7ff488986498: i32 = any_extend 0x7ff488985f50, ser/interfaces/serializer.zig:239:48
      0x7ff488985f50: i8,ch = CopyFromReg 0x7ff489330868, Register:i8 %3, ser/interfaces/serializer.zig:239:48
        0x7ff488986430: i8 = Register %3
    0x7ff488985868: ch,glue = CopyToReg 0x7ff48aa3eaa0, Register:i64 $rdi, 0x7ff4882d85d0, ser/interfaces/serializer.zig:239:48
      0x7ff488985ba8: i64 = Register $rdi
      0x7ff4882d85d0: i64,ch = CopyFromReg 0x7ff489330868, Register:i64 %1, ser/interfaces/serializer.zig:239:48
        0x7ff48a18d000: i64 = Register %1
In function: ser.interfaces.serializer.Serializer(main.Serializer,void,error{Io,Syntax},.{ser.blocks.array_list, ser.blocks.bounded_array, ser.blocks.hash_map, ser.blocks.linked_list, ser.blocks.tail_queue, ser.blocks.array, ser.blocks.bool, ser.blocks.enum, ser.blocks.error, ser.blocks.float, ser.blocks.int, ser.blocks.null, ser.blocks.optional, ser.blocks.pointer, ser.blocks.slice, ser.blocks.string, ser.blocks.struct, ser.blocks.tuple, ser.blocks.union, ser.blocks.vector, ser.blocks.void},.{ser.blocks.array_list, ser.blocks.bounded_array, ser.blocks.hash_map, ser.blocks.linked_list, ser.blocks.tail_queue, ser.blocks.array, ser.blocks.bool, ser.blocks.enum, ser.blocks.error, ser.blocks.float, ser.blocks.int, ser.blocks.null, ser.blocks.optional, ser.blocks.pointer, ser.blocks.slice, ser.blocks.string, ser.blocks.struct, ser.blocks.tuple, ser.blocks.union, ser.blocks.vector, ser.blocks.void},lib.TODO,lib.TODO,lib.TODO,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined).getty.Serializer.serializeBool
error: test...
error: The following command terminated unexpectedly:
[...]

Before, we would get a nice "use of undefined value" compile error. And I believe that's still the case with deserializers. I'm not sure if this has something to do with serializers or if it's a Zig issue.

How to Reproduce the Bug

Run the following:

const std = @import("std");
const getty = @import("getty");

const Serializer = struct {
    pub usingnamespace getty.Serializer(
        @This(),
        Ok,
        Error,
        getty.default_st,
        getty.default_st,
        getty.TODO,
        getty.TODO,
        getty.TODO,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
        undefined,
    );

    const Ok = void;
    const Error = error{ Io, Syntax };
};

pub fn main() anyerror!void {
    var s = Serializer{};
    const serializer = s.serializer();

    try getty.serialize(true, serializer);
}

Additional Context

No response

Make attribute setting more ergonomic

Bad

const Point = struct {
    usingnamespace Attributes(@This(), .ser, .{ .x = .{ .rename = "a" } });
    usingnamespace Attributes(@This(), .de, .{ .x = .{ .rename = "de_a" } });

    x: i32,
    y: i32,
};

Good

const Point = struct {
    usingnamespace Attributes(@This(), .{ .x = .{ .rename = "a", .rename_deserialize = "de_a" } });

    x: i32,
    y: i32,
};

Rename `getty.de.Seq` to `getty.de.SeqAccess`

The interface function for getty.de.Seq should also be renamed from seq to seqAccess.

Inferred error sets in required method implementations crashes stage2 compiler

Description

On Zig master (first seen in 0.10.0-dev.3978+4fd4c733d), stage2 compilations for some programs implementing getty.Serializer (and getty.Deserializer probably) are crashing.

The problem seems to occur when a method implementation is provided for a required method that contains an anytype parameter (e.g., serializeFloat, serializeInt). This causes the crash. Providing implementations for methods like serializeBool or serializeNull still works completely fine.

How to Reproduce the Bug

Works

const std = @import("std");
const getty = @import("getty");

const Serializer = struct {
    pub usingnamespace getty.Serializer(
        @This(),
        Ok,
        Error,
        null,
        null,
        null,
        null,
        null,
        .{
            .serializeBool = serializeBool,
        },
    );

    const Ok = void;
    const Error = getty.ser.Error || error{Foobar};

    fn serializeBool(_: @This(), value: bool) !Ok {
        std.debug.print("{}\n", .{value});
    }
};

pub fn main() anyerror!void {
    var s = Serializer{};
    const serializer = s.serializer();

    try getty.serialize(null, true, serializer);
}

Doesn't Work

const std = @import("std");
const getty = @import("getty");

const Serializer = struct {
    pub usingnamespace getty.Serializer(
        @This(),
        Ok,
        Error,
        null,
        null,
        null,
        null,
        null,
        .{
            .serializeInt = serializeInt,
        },
    );

    const Ok = void;
    const Error = getty.ser.Error || error{Foobar};

    fn serializeInt(_: @This(), value: anytype) !Ok {
        std.debug.print("{}\n", .{value});
    }
};

pub fn main() anyerror!void {
    var s = Serializer{};
    const serializer = s.serializer();

    try getty.serialize(null, 123, serializer);
}

$ zig build run
zig build-exe test Debug native: error: thread 431519 panic: attempt to unwrap error
Unable to dump stack trace: debug info stripped

zig build-exe test Debug native: error: the following command terminated unexpectedly:
/Users/jason/.asdf/installs/zig/master/zig build-exe /Users/jason/Projects/Personal/test/src/main.zig --cache-dir /Users/jason/Projects/Personal/test/zig-cache --global-cache-dir /Users/jason/.cache/zig --name test --mod getty::/Users/jason/.cache/zig/p/1220c649e62d6bb0e07ac6b02711f76254f1dd1875859ad075873d248fb820f68d66/src/getty.zig --deps getty --listen=-
Build Summary: 0/5 steps succeeded; 1 failed (disable with -fno-summary)
run transitive failure
└─ run test transitive failure
   ├─ zig build-exe test Debug native failure
   └─ install transitive failure
      └─ install test transitive failure
         └─ zig build-exe test Debug native (reused)
error: the following build command failed with exit code 1:
/Users/jason/Projects/Personal/test/zig-cache/o/ee1be17a562f060aa4ae0c496b5dcfcc/build /Users/jason/.asdf/installs/zig/master/zig /Users/jason/Projects/Personal/test /Users/jason/Projects/Personal/test/zig-cache /Users/jason/.cache/zig run

Additional Context

Compiling with stage1 fixes the problem. So if you need a workaround for now, the following should work for you:

zig build -fstage1 run

Replace `usingnamespace` since it's getting removed

Deserialization of tuples of tuples no longer works

Description

Deserializing tuples of tuples seems to break the compiler now.

How to Reproduce the Bug

Uncomment the relevant test in the "tuple" test in src/tests/de/tests.zig and run zig build test.

Additional Context

This is due to stage2.

error: values of type 'anyopaque' must be comptime known, but operand value is runtime known

Description

const Omg = struct {
    children: std.AutoArrayHashMap(u32, u32),
};

W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'anyopaque' must be comptime known, but operand value is runtime known
                        free(allocator, value.*);
                                        ~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: opaque type 'anyopaque' has undefined size
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'fn(*anyopaque, usize, u29, u29, usize) error{OutOfMemory}![]u8' must be comptime known, but operand value is runtime known
                        free(allocator, value.*);
                                        ~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: use '*const fn(*anyopaque, usize, u29, u29, usize) error{OutOfMemory}![]u8' for a function pointer type
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'fn(*anyopaque, []u8, u29, usize, u29, usize) ?usize' must be comptime known, but operand value is runtime known
                        free(allocator, value.*);
                                        ~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: use '*const fn(*anyopaque, []u8, u29, usize, u29, usize) ?usize' for a function pointer type
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'fn(*anyopaque, []u8, u29, usize) void' must be comptime known, but operand value is runtime known
                        free(allocator, value.*);
                                        ~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: use '*const fn(*anyopaque, []u8, u29, usize) void' for a function pointer type
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:75:21: error: concept `StringKey` was not satisfied (expected key type to be `[]const u8`)
            else => @compileError(base ++ extra),
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:48:57: note: called from here
                for (results) |result| if (!result) fail(concept, msg);
                                                    ~~~~^~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\json-getty-zig-github.com-5753d08c\pkg\src\de\deserializer.zig:323:91: note: called from here
            comptime concepts.Concept("StringKey", "expected key type to be `[]const u8`")(.{
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

The deserialization block:

pub const de_AutoArrayHashMap = struct {
    pub fn is(comptime T: type) bool {
        @compileLog(@typeName(T));
        return comptime std.mem.startsWith(u8, @typeName(T), "array_hash_map");
    }

    pub const Visitor = AutoArrayHashMapVisitor;
    pub const deserialize = getty.de.blocks.HashMap.deserialize;
};


pub fn AutoArrayHashMapVisitor(comptime HashMap: type) type {
    return struct {
        const Self = @This();

        pub usingnamespace getty.de.Visitor(
            Self,
            Value,
            undefined,
            undefined,
            undefined,
            undefined,
            visitMap,
            undefined,
            undefined,
            undefined,
            undefined,
            undefined,
            undefined,
        );

        const Value = HashMap;

        fn visitMap(_: Self, allocator: ?std.mem.Allocator, comptime Deserializer: type, map: anytype) Deserializer.Error!Value {
            const K = std.meta.fieldInfo(Value.KV, .key).field_type;
            const V = std.meta.fieldInfo(Value.KV, .value).field_type;
            const unmanaged = comptime std.mem.startsWith(
                u8,
                @typeName(Value),
                "array_hash_map.ArrayHashMapUnmanaged",
            );
            @compileLog(K, V, Value, HashMap);

            var hash_map = if (unmanaged) HashMap{} else HashMap.init(allocator.?);
            errdefer getty.de.free(allocator.?, hash_map);

            while (try map.nextKey(allocator, K)) |key| {
                errdefer getty.de.free(allocator.?, key);

                const value = try map.nextValue(allocator, V);
                errdefer getty.de.free(allocator.?, value);

                try if (unmanaged) hash_map.put(allocator.?, key, value) else hash_map.put(key, value);
            }

            return hash_map;
        }
    };
}

How to Reproduce the Bug

const Omg = struct {
    children: std.AutoArrayHashMap(u32, u32),
};

_ = try json.fromSliceWith(null, Omg, file_data, model_serialization.deserialization_blocks);

Additional Context

No response

Have `serializeSequence` return an end function

Type-defined (De)Serialization Blocks/Tuples

Problem

Say you've written a library with the following type:

const Point = struct {
    x: i32,
    y: i32,
};

Now, Point is already serializable by Getty, so users can immediately serialize it into JSON without having to provide a serialization block/tuple. However, suppose that you, the author of the library, want Point to be serialized as a JSON list by default, not an object. Currently, there's no way to do this. There's no way to specify a default (de)serialization behavior for a type.

Instead, what you have to do is define a public SBT for Point that users can import. There are two main problems with this:

Users would have to check every library they use for SBTs beforehand to ensure that they're serializing types correctly.
If Point contained a type from another library, which itself has an associated SBT, the user would have to 1) know about the second SBT and 2) provide both SBTs to Getty in order to properly serialize a Point value.

These two problems make Getty completely impractical for real-world use whenever packages become involved.

Proposal

Type-Defined Blocks/Tuples

Currently, there are only three entities that can define blocks and tuples (BT): Getty, (de)serializers, and users. This proposal adds a fourth entity: types. Specifically, any type that can define declarations (i.e., struct, enum, union).

Naming

To avoid name conflicts, a type-defined BT must be named either @"getty.sbt" or @"getty.dbt", depending on whether it's intended for serialization or deserialization.

Selection Process

With type-defined BTs, here's the new selection process Getty will go through:

Select a matching user-defined BT, if any exist.
Select a matching BT in the type being (de)serialized, if any exist.
Select a matching (de)serializer-defined BT, if any exist.
Select a matching Getty-defined BT, if any exist.
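A rough sketch of that priority order, written as a comptime lookup. This is only an illustration, not Getty's implementation: `Source`, `matches`, and `select` are hypothetical names, the sketch only handles container types (since `@hasDecl` requires one), and it treats the mere presence of `@"getty.sbt"` as a match.

```zig
const std = @import("std");

// Hypothetical tag describing where the selected BT came from.
const Source = enum { user, type_defined, serializer, getty, none };

// Returns true if any block in the tuple claims `T` via its `is` function.
fn matches(comptime blocks: anytype, comptime T: type) bool {
    inline for (blocks) |block| {
        if (comptime block.is(T)) return true;
    }
    return false;
}

// Walks the four sources in the order listed above.
// Assumes `T` is a struct, enum, or union.
fn select(
    comptime T: type,
    comptime user: anytype,
    comptime ser: anytype,
    comptime getty: anytype,
) Source {
    if (comptime matches(user, T)) return .user;
    if (comptime @hasDecl(T, "getty.sbt")) return .type_defined;
    if (comptime matches(ser, T)) return .serializer;
    if (comptime matches(getty, T)) return .getty;
    return .none;
}

const Plain = struct { x: i32 };

const WithBlock = struct {
    x: i32,

    pub const @"getty.sbt" = .{};
};

// Hypothetical block that matches every type.
const MatchesAll = struct {
    pub fn is(comptime T: type) bool {
        _ = T;
        return true;
    }
};

test "type-defined BT sits between user and serializer BTs" {
    try std.testing.expectEqual(Source.user, select(WithBlock, .{MatchesAll}, .{}, .{}));
    try std.testing.expectEqual(Source.type_defined, select(WithBlock, .{}, .{MatchesAll}, .{MatchesAll}));
    try std.testing.expectEqual(Source.getty, select(Plain, .{}, .{}, .{MatchesAll}));
}
```

The point of the ordering is that a user-defined BT can still override a library author's default, while the type's own BT takes precedence over anything a (de)serializer or Getty would pick.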

Example

const std = @import("std");
const json = @import("json");

const Point = struct {
    x: i32,
    y: i32,

    pub const @"getty.sbt" = .{struct {
        pub fn is(comptime T: type) bool {
            return T == Point;
        }

        pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
            var s = try serializer.serializeSeq(2);
            const seq = s.seq();

            try seq.serializeElement(value.x);
            try seq.serializeElement(value.y);

            return try seq.end();
        }
    }};
};

pub fn main() anyerror!void {
    const string = try json.toSlice(std.heap.page_allocator, Point{ .x = 1, .y = 2 });
    std.debug.print("{s}\n", .{string});
}

Alternatives

No response

Additional Context

No response

Use external tagging for union serialization

Problem

Currently, (tagged) unions are serialized as their payload type. That is, a value of type union { foo: i32 } is serialized exactly as an i32 would be. There are two problems with this:

There is no way to properly deserialize such serialized values back into a union (tagged or untagged). The best you can do is try each variant and use the one that fits. Of course, you'd run into trouble if two variants have the same or similar types.
Another problem is that unions are currently deserialized using external tagging. That is, Getty expects serialized unions to be a map like { "foo": 123 } or a string for void variants. There's an inconsistency between serialization and deserialization.

Proposal

Unions should be serialized using the externally tagged format. Extra formats (e.g., internally tagged) may be provided as well in the future.
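For reference, here is a hand-written sketch of what externally tagged output looks like for a small union. It doesn't use Getty at all: `Value` and `writeExternallyTagged` are hypothetical names, and the JSON is written directly so the shape is easy to see (a single-entry map for payload variants, a plain string for void variants).

```zig
const std = @import("std");

// Hypothetical union used only for this illustration.
const Value = union(enum) {
    foo: i32,
    bar: void,
};

// Writes `value` as JSON in the externally tagged format:
// payload variants become {"<tag>":<payload>}, void variants become "<tag>".
fn writeExternallyTagged(writer: anytype, value: Value) !void {
    switch (value) {
        .foo => |n| try writer.print("{{\"foo\":{d}}}", .{n}),
        .bar => try writer.writeAll("\"bar\""),
    }
}

test "externally tagged union output" {
    var buf: [32]u8 = undefined;
    var stream = std.io.fixedBufferStream(&buf);

    try writeExternallyTagged(stream.writer(), .{ .foo = 123 });
    try std.testing.expectEqualStrings("{\"foo\":123}", stream.getWritten());
}
```

Because the tag is part of the output, a deserializer can pick the variant by name instead of guessing from the payload's shape.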

Alternatives

No response

Additional Context

No response

Simplify `getty.ser.serialize`'s return type

Doesn't make much sense to check that the serializer parameter is a pointer in a function called SerializerErrorUnion. So, the return type should instead be this:

pub fn serialize(serializer: anytype, value: anytype) switch (@typeInfo(@TypeOf(serializer))) {
    // `info.child` is the serializer type that the pointer refers to.
    .Pointer => |info| info.child.Error!info.child.Ok,
    else => @compileError("expected pointer to serializer, found " ++ @typeName(@TypeOf(serializer))),
} {
    // ...
}

Handle failures for testing functions

Starting from Zig version 67154d2, the std.testing.expect* functions can fail (they return an error), so every call to them needs to be prefixed with try.
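A minimal sketch of what the updated calls look like. The test name and values are made up for illustration; only the `try` prefixes matter.

```zig
const std = @import("std");

test "expect functions are called with try" {
    // Each call returns an error union, so a failed expectation
    // propagates out of the test instead of being ignored.
    try std.testing.expect(1 + 1 == 2);
    try std.testing.expectEqual(@as(i32, 2), 1 + 1);
    try std.testing.expectEqualSlices(u8, "abc", "abc");
}
```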

Remove `is` function in type-defined blocks

Problem

Type-defined blocks (TBs) currently need to specify an is function, like all SBs. However, it's unnecessary since a TB only ever applies to a single type.

Proposal

Remove the need for an is function in type-defined blocks!

Alternatives

No response

Additional Context

No response

Implement `deserializeEnum` for test deserializer

Return interface value from aggregate serialization methods

Problem

Currently, methods like serializeSeq return a value of a type that implements getty.ser.Seq.

I think it'd be better if those methods returned the interface value instead, for two reasons:

As it is now, blocks have to do the whole s.seq() thing even though the blocks always use the return value of serializeSeq solely for its interface behavior. They never care about the actual implementation, so they shouldn't have access to it in the first place.
It'd be more consistent with how deserializers work, which pass in the interface values for, say, getty.de.SeqAccess to visitors.

Proposal

Serializers should pass in Seq.@"getty.ser.Seq" for the Seq parameter in their call to getty.Serializer (and Getty should check at comptime that it's a getty.ser.Seq). With that, I think Getty should be able to use it as the return value for serializeSeq.

With this change, serializers would be the ones writing s.seq() instead of blocks. That's better because serializers define the implementations, so it's fine for them to see an implementation's details. Blocks shouldn't have access to any of that.

Alternatives

No response

Additional Context

No response

Add one pointer deserialization support

Change `getty.deserialize` parameter order

Have deserializer come before T.

Make all types implicitly (de)serializable

The interface approach to (de)serializables is a bit useless.

Unlike in Rust where you can implement Serialize and Deserialize on primitive and user data types, you can only do the latter in Zig. One consequence of this is that primitives are assumed to be implicitly (de)serializable by Getty. In contrast, user data types are not assumed to be implicitly (de)serializable.

This is a problem for a few reasons:

It's inconsistent. Either all types must be explicitly denoted as (de)serializable or they should all be implicitly assumed to be so.
It's practically useless, since functions can't take Serialize or Deserialize parameters as that would only allow structs and enums.
You can't implement Serialize or Deserialize for data types outside of your code (e.g., ArrayList).
Interfaces are general specifications of behavior. But Getty provides Serialize and Deserialize implementations for any user data type (comptime ftw). There's no need for an interface when we literally just implement the behavior. Thus, user-provided implementations simply become a mechanism for customization, which can easily be handled by something like std.meta.hasFn.

The proposal is to make all types implicitly (de)serializable. Everything pretty much stays the same, except now there's no interface to implement.

const getty_json = @import("getty_json");
const std = @import("std");

const allocator = std.heap.page_allocator; // std.testing.allocator is only usable inside tests
const print = std.debug.print;

const Point = struct {
    x: i32,
    y: i32,
};

pub fn main() anyerror!void {
    const point = Point{ .x = 1, .y = 2 };
    const json = try getty_json.toArrayList(allocator, point);
    defer json.deinit();

    print("{s}\n", .{json.items}); // {"x":1,"y":2}
}

For attributes, it would largely be the same, except you use an Attributes function:

const getty = @import("getty");

const Point = struct {
    usingnamespace getty.Attributes(.{ .x = .{ .rename = "a" } });

    x: i32,
    y: i32,
};

Publish to aquila.red

https://aquila.red is a package index for zigmod.

This would make it simple to add this package to a project via zigmod:

zigmod aq add 1/getty-zig/getty
zigmod fetch

Add serialization support for unions

Make `isZigString` call `comptime`

In the following code, trait.isZigString(T) should be made comptime.

getty/src/ser.zig

Line 50 in 721e88d

 if (info.child == u8 and trait.isZigString(T) and unicode.utf8ValidateSlice(value)) { 

The issue I think is that the comptime applies to the next condition as well, which throws up an error. Need to figure out a clean way around that.
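One possible way around it, sketched under assumptions: give the comptime-only check its own `if` so that `comptime` can't extend to the runtime condition. `isByteSlice` is a hypothetical stand-in for `trait.isZigString`, and `isValidString` is a made-up wrapper, not the actual code in src/ser.zig.

```zig
const std = @import("std");

// Hypothetical stand-in for the comptime-only type check.
fn isByteSlice(comptime T: type) bool {
    return T == []const u8 or T == []u8;
}

fn isValidString(value: anytype) bool {
    const T = @TypeOf(value);

    // The type check is forced to comptime in its own condition...
    if (comptime isByteSlice(T)) {
        // ...so the runtime-only validation is a separate expression.
        // This branch is only analyzed when the check above is true.
        return std.unicode.utf8ValidateSlice(value);
    }

    return false;
}

test "comptime check is kept apart from the runtime check" {
    const s: []const u8 = "hello";
    try std.testing.expect(isValidString(s));
    try std.testing.expect(!isValidString(@as(u32, 1)));
}
```

Wrapping only the comptime part in parentheses, as in `(comptime check(T)) and runtimeCheck(value)`, should have the same effect, though nesting the conditions makes the split more obvious.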

Simplify Serialize's and Deserialize's required method signature

Currently, to implement Serialize and Deserialize, you need the following:

fn serialize(self: Self, comptime S: type, serializer: *S) S.Error!S.Ok
fn deserialize(self: Self, comptime D: type, deserializer: *D) D.Error!D.Ok

We can simplify this by using anytype since that's basically just sugar for what we have above:

const std = @import("std");

fn ErrorUnion(comptime T: type) type {
    return switch (@typeInfo(T)) {
        .Pointer => std.meta.Child(T).Error!std.meta.Child(T).Ok,
        else => @compileError("expected pointer, found " ++ @typeName(T)),
    };
}

fn serialize(self: Self, serializer: anytype) ErrorUnion(@TypeOf(serializer))
fn deserialize(self: Self, deserializer: anytype) ErrorUnion(@TypeOf(deserializer))

We can also do the same for getty.serialize.

This might also let us use the Allocator interface design, but I'm not sure.

Edit: Using the Allocator design might not even make sense for Serialize and Deserialize. Being able to pass around Serializes or Deserializes is practically useless because you can't pass around primitives that way. So maybe we just simplify and then continue using the writer/reader design.

Make required methods optional

Problem

In order to signal that a required method of a Getty interface is not implemented, the implementing type must currently pass undefined into the interface for that required method. Obviously, this isn't ideal.

The best option would be to pass in null for unimplemented methods. In fact, that's what Getty used to do. However, it was found that Zig would crash whenever an unimplemented generic method was called. And so, Getty transitioned from optional required methods to mandatory ones.

However! I just tried out optional required methods on a whim today, and it looks like they may be possible now!

Proposal

Required method parameters in Getty interfaces can be marked as optional. For example,

fn BoolSerializer(
    comptime Context: type,
    comptime O: type,
    comptime E: type,
    comptime serializeBoolFn: ?fn (Context, anytype) E!O,
) type

Then, the interface's implementation would look something like this:

fn BoolSerializer(
    comptime Context: type,
    comptime O: type,
    comptime E: type,
    comptime serializeBoolFn: ?fn (Context, anytype) E!O,
) type {
    return struct {
        pub const Interface = struct {
            context: Context,

            pub const Ok = O;
            pub const Error = E;

            pub fn serializeBool(self: @This(), value: anytype) Error!Ok {
                if (serializeBoolFn) |f| {
                    return f(self.context, value);
                }

                @compileError("unimplemented"); // This is only raised if serializeBoolFn is null!
            }
        };

        pub fn boolSerializer(self: Context) Interface {
            return .{ .context = self };
        }
    };
}

The implementation side of things would remain the same, except they can now pass in null instead of undefined for unimplemented required methods.

No idea why this all works now, and I'll have to do a bit more testing to see whether this works for everything (e.g., deserialization), but I'm really excited about this!

Add deserialization support for unions

I feel like getty is really missing a visitor for tagged unions. I think this could be done similar to the default behaviour of serde.

Attributes for aggregate types

Problem

Currently, all modifications to the (de)serialization process for a type are required to be specified in the serialize and deserialize + Visitor functions of SBTs and DBTs, respectively.

This includes even small modifications. For example, if you want to skip a field when serializing a struct, you need to manually write out the entire sequence serialization process in serialize. Modifications like these are common enough that users shouldn't have to rewrite them every single time.

Proposal

A new declaration called attributes will be added to SBTs and DBTs.

The declaration will be mutually exclusive with all existing SBT/DBT declarations (i.e., serialize, deserialize, and Visitor) except is. In other words, you'll have two options when it comes to customizing Getty's behavior now:

Use a combination of attributes to get the behavior you want.
Specify the behavior yourself if attributes can't cover all your needs.

Attribute Types

There are three kinds of attributes a user can specify: field, variant, and container. Field attributes correspond to the fields of a struct, variant attributes correspond to the variants of an enum or union, and container attributes correspond to the struct, enum, or union itself.

Attribute Declaration

The attributes declaration is a struct, where each field matches either the word Container or a field/variant name in the type being (de)serialized. For example:

const Point = struct {
    x: i32,
    y: i32,

    pub const @"getty.sbt" = struct {
        pub fn is(comptime T: type) bool {
            return T == Point;
        }

        pub const attributes = .{
            .Container = .{ .name = "Coordinate" },
            .x = .{ .name = "X" },
            .y = .{ .skip = true },
        };
    };
};
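To give an idea of how such a declaration could be consumed, here's a small sketch that looks up a single attribute at comptime. It's an assumption about one possible approach, not Getty's implementation: `shouldSkip` is a hypothetical helper, and the `attributes` value simply mirrors the example above.

```zig
const std = @import("std");

// Mirrors the attributes declaration from the example above.
const attributes = .{
    .Container = .{ .name = "Coordinate" },
    .x = .{ .name = "X" },
    .y = .{ .skip = true },
};

// Hypothetical helper: reports whether `field` carries `.skip = true`.
// Fields without an entry, or without a `skip` attribute, are not skipped.
fn shouldSkip(comptime attrs: anytype, comptime field: []const u8) bool {
    if (@hasField(@TypeOf(attrs), field)) {
        const attr = @field(attrs, field);
        if (@hasField(@TypeOf(attr), "skip")) return attr.skip;
    }
    return false;
}

test "skip attribute lookup" {
    try std.testing.expect(shouldSkip(attributes, "y"));
    try std.testing.expect(!shouldSkip(attributes, "x"));
}
```

Since the lookup only needs the attribute struct and a field name, the declaration itself never has to mention the type being (de)serialized.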

SBT/DBT Validation

If an SBT or DBT does not specify attributes, then validation is the same (i.e., the interface validates them). Otherwise, the validation will be deferred until Getty actually performs (de)serialization (i.e., when it's choosing SBTs/DBTs).

This enables the attributes declaration to remain a simple anonymous struct and avoids requiring the user to pass in the type that they're trying to (de)serialize as part of the attributes declaration, which becomes an issue for anonymous and generic types.
