getty-zig / getty
A (de)serialization framework for Zig
Home Page: https://getty.so
License: MIT License
Currently, (tagged) unions are serialized as their payload type. That is, a value of type union { foo: i32 } would be serialized the same as an i32 would. There are two problems with this:
{ "foo": 123 } or a string for void variants.
There's an inconsistency between serialization and deserialization.
Unions should be serialized using the externally tagged format. Extra formats (e.g., internally tagged) may be provided as well in the future.
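To make the externally tagged format concrete, here is a minimal sketch that writes a tagged union as JSON text by hand. It does not use Getty's API: the Example type and the writeExternallyTagged helper are hypothetical and only illustrate the target output.

```zig
const std = @import("std");

// Hypothetical union used only for illustration.
const Example = union(enum) {
    foo: i32,
    bar: void,
};

// Writes `value` in the externally tagged form: a variant with a payload
// becomes a single-entry map keyed by the variant name, and a void variant
// becomes a plain string.
fn writeExternallyTagged(buf: []u8, value: Example) ![]u8 {
    return switch (value) {
        .foo => |payload| std.fmt.bufPrint(buf, "{{\"foo\":{d}}}", .{payload}),
        .bar => std.fmt.bufPrint(buf, "\"bar\"", .{}),
    };
}

pub fn main() !void {
    var buf: [32]u8 = undefined;
    std.debug.print("{s}\n", .{try writeExternallyTagged(&buf, .{ .foo = 123 })});
    std.debug.print("{s}\n", .{try writeExternallyTagged(&buf, .bar)});
}
```

With this shape, a deserializer that expects a map or a string for a union would receive exactly what the serializer produced.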
It doesn't make much sense to check that the serializer parameter is a pointer in a function called SerializerErrorUnion. So, the return type should instead be this:
pub fn serialize(serializer: anytype, value: anytype) switch (@typeInfo(@TypeOf(serializer))) {
    .Pointer => |info| info.child.Error!info.child.Ok,
    else => @compileError("expected pointer to serializer, found " ++ @typeName(@TypeOf(serializer))),
} {
    // ...
}
Currently, allocation during deserialization works like this:
Every pointer value within the final value produced by a visitor will be allocated by the visitor, except for struct field names.
For example:
If i32 is the type being deserialized into, neither the visitor nor the deserializer needs to allocate anything.
If *i32 is the type being deserialized into, the visitor is responsible for allocating the pointer. The deserializer can just pass an integer to the visitor.
If []u8 is the type being deserialized into, the visitor is responsible for allocating the slice. In the case where the deserializer calls visitString, the deserializer can pass in either a string on the heap or a string on the stack. If the passed-in string is on the heap, the deserializer should always deallocate it after its call to visitString.
If struct { x: i32 } is the type being deserialized into, neither the visitor nor the deserializer needs to allocate anything. The field name is not a part of the final struct value (it's only used to compare against field names), so it doesn't require allocation.
Generally speaking, this model works well. However, it makes deserialization for certain data structures inefficient. For example, std.BufMap is a wrapper around std.StringHashMap that copies keys and values into the map instead of just moving them in. Under the current allocation model, deserializers would most likely allocate the keys and values (since that's what they generally do for maps), but then the visitor would duplicate those keys and values, resulting in a second allocation.
Should there be a way to let deserializers know when they should or shouldn't allocate the values they pass to a visitor? Or should the default visitors follow the aforementioned allocation model even if it's less efficient, and let users/deserializers handle edge cases?
In the std.BufMap case, a deserializer right now could simply provide its own deserialization block or use unique logic specifically for std.BufMap. But that's clearly not scalable, especially as more and more types get supported by Getty.
Consider the following, where we have a Point type that contains an ST specifying how to serialize Point and i32 values.
const Point = struct {
x: i32,
y: i32,
pub const @"getty.sbt" = .{ point_sbt, i32_sbt };
};
const point_sbt = struct {
pub fn is(comptime T: type) bool {
return T == Point;
}
pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
var s = try serializer.serializeSeq(2);
const seq = s.seq();
inline for (std.meta.fields(Point)) |field| {
try seq.serializeElement(@field(value, field.name));
}
return try seq.end();
}
};
const i32_sbt = struct {
pub fn is(comptime T: type) bool {
return T == i32;
}
pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
return try serializer.serializeBool(value > 0);
}
};
Getty correctly uses point_sbt to serialize Point values. However, it doesn't use i32_sbt to serialize the fields of Point values, resulting in the following, incorrect serialization:
$ zig build run
[0,1]
const std = @import("std");
const getty = @import("getty");
const json = @import("json");
const allocator = std.heap.page_allocator;
const Point = struct {
x: i32,
y: i32,
pub const @"getty.sbt" = .{ point_sbt, i32_sbt };
};
const point_sbt = struct {
pub fn is(comptime T: type) bool {
return T == Point;
}
pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
var s = try serializer.serializeSeq(2);
const seq = s.seq();
inline for (std.meta.fields(Point)) |field| {
try seq.serializeElement(@field(value, field.name));
}
return try seq.end();
}
};
const i32_sbt = struct {
pub fn is(comptime T: type) bool {
return T == i32;
}
pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
return try serializer.serializeBool(value > 0);
}
};
pub fn main() !void {
const point = Point{ .x = 0, .y = 1 };
const s = try json.toSliceWith(allocator, point, .{});
defer allocator.free(s);
std.debug.print("{s}\n", .{s});
}
const Omg = struct {
children: std.AutoArrayHashMap(u32, u32),
};
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'anyopaque' must be comptime known, but operand value is runtime known
free(allocator, value.*);
~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: opaque type 'anyopaque' has undefined size
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'fn(*anyopaque, usize, u29, u29, usize) error{OutOfMemory}![]u8' must be comptime known, but operand value is runtime known
free(allocator, value.*);
~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: use '*const fn(*anyopaque, usize, u29, u29, usize) error{OutOfMemory}![]u8' for a function pointer type
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'fn(*anyopaque, []u8, u29, usize, u29, usize) ?usize' must be comptime known, but operand value is runtime known
free(allocator, value.*);
~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: use '*const fn(*anyopaque, []u8, u29, usize, u29, usize) ?usize' for a function pointer type
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: error: values of type 'fn(*anyopaque, []u8, u29, usize) void' must be comptime known, but operand value is runtime known
free(allocator, value.*);
~~~~~^~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-8be90217\pkg\src\de.zig:188:46: note: use '*const fn(*anyopaque, []u8, u29, usize) void' for a function pointer type
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:75:21: error: concept `StringKey` was not satisfied (expected key type to be `[]const u8`)
else => @compileError(base ++ extra),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:48:57: note: called from here
for (results) |result| if (!result) fail(concept, msg);
~~~~^~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\json-getty-zig-github.com-5753d08c\pkg\src\de\deserializer.zig:323:91: note: called from here
comptime concepts.Concept("StringKey", "expected key type to be `[]const u8`")(.{
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
The deserialization block:
pub const de_AutoArrayHashMap = struct {
pub fn is(comptime T: type) bool {
@compileLog(@typeName(T));
return comptime std.mem.startsWith(u8, @typeName(T), "array_hash_map");
}
pub const Visitor = AutoArrayHashMapVisitor;
pub const deserialize = getty.de.blocks.HashMap.deserialize;
};
pub fn AutoArrayHashMapVisitor(comptime HashMap: type) type {
return struct {
const Self = @This();
pub usingnamespace getty.de.Visitor(
Self,
Value,
undefined,
undefined,
undefined,
undefined,
visitMap,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
);
const Value = HashMap;
fn visitMap(_: Self, allocator: ?std.mem.Allocator, comptime Deserializer: type, map: anytype) Deserializer.Error!Value {
const K = std.meta.fieldInfo(Value.KV, .key).field_type;
const V = std.meta.fieldInfo(Value.KV, .value).field_type;
const unmanaged = comptime std.mem.startsWith(
u8,
@typeName(Value),
"array_hash_map.ArrayHashMapUnmanaged",
);
@compileLog(K, V, Value, HashMap);
var hash_map = if (unmanaged) HashMap{} else HashMap.init(allocator.?);
errdefer getty.de.free(allocator.?, hash_map);
while (try map.nextKey(allocator, K)) |key| {
errdefer getty.de.free(allocator.?, key);
const value = try map.nextValue(allocator, V);
errdefer getty.de.free(allocator.?, value);
try if (unmanaged) hash_map.put(allocator.?, key, value) else hash_map.put(key, value);
}
return hash_map;
}
};
}
const Omg = struct {
children: std.AutoArrayHashMap(u32, u32),
};
_ = try json.fromSliceWith(null, Omg, file_data, model_serialization.deserialization_blocks);
If you want to deserialize into type T from either a JSON map or a JSON list, you can't really do that right now since you have to specify deserializeMap or deserializeSequence in your deserialize function.
Say you've written a library with the following type:
const Point = struct {
x: i32,
y: i32,
};
Now, Point is already serializable by Getty, so users can immediately serialize it into JSON without having to provide a serialization block/tuple. However, suppose that you, the author of the library, want Point to be serialized as a JSON list by default, not an object. Currently, there's no way to do this. There's no way to specify a default (de)serialization behavior for a type.
Instead, what you have to do is define a public SBT for Point that users can import. There are two main problems with this:
If Point contained a type from another library, which itself has an associated SBT, the user would have to 1) know about the second SBT and 2) provide both SBTs to Getty in order to properly serialize a Point value.
These two problems make Getty completely impractical for real-world use whenever packages become involved.
Currently, there are only three entities that can define blocks and tuples (BT): Getty, (de)serializers, and users. This proposal adds a fourth entity: types. Specifically, any type that can define declarations (i.e., struct, enum, union).
To avoid name conflicts, a type-defined BT must be named either @"getty.sbt" or @"getty.dbt", depending on whether it's intended for serialization or deserialization.
With type-defined BTs, here's the new selection process Getty will go through:
const std = @import("std");
const json = @import("json");
const Point = struct {
x: i32,
y: i32,
pub const @"getty.sbt" = .{struct {
pub fn is(comptime T: type) bool {
return T == Point;
}
pub fn serialize(value: anytype, serializer: anytype) !@TypeOf(serializer).Ok {
var s = try serializer.serializeSeq(2);
const seq = s.seq();
try seq.serializeElement(value.x);
try seq.serializeElement(value.y);
return try seq.end();
}
}};
};
pub fn main() anyerror!void {
const string = try json.toSlice(std.heap.page_allocator, Point{ .x = 1, .y = 2 });
std.debug.print("{s}\n", .{string});
}
Currently, all modifications to the (de)serialization process for a type are required to be specified in the serialize and deserialize + Visitor functions of SBTs and DBTs, respectively.
This includes even small modifications. For example, if you want to skip a field when serializing a struct, you need to manually write out the entire sequence serialization process in serialize. Modifications like these are common enough that users shouldn't have to rewrite them every single time.
A new declaration called attributes will be added to SBTs and DBTs.
The declaration will be mutually exclusive with all existing SBT/DBT declarations except is (i.e., serialize, deserialize, and Visitor). In other words, you'll have two options when it comes to customizing Getty's behavior now:
There are three kinds of attributes a user can specify: field, variant, and container. Field attributes correspond to the fields of a struct, variant attributes correspond to the variants of an enum or union, and container attributes correspond to the struct, enum, or union itself.
The attributes declaration is a struct, where each field matches either the word Container or a field/variant name in the type being (de)serialized. For example:
const Point = struct {
x: i32,
y: i32,
pub const @"getty.sbt" = struct {
pub fn is(comptime T: type) bool {
return T == Point;
}
pub const attributes = .{
.Container = .{ .name = "Coordinate" },
.x = .{ .name = "X" },
.y = .{ .skip = true },
};
};
};
If an SBT or DBT does not specify attributes, then validation is the same (i.e., the interface validates them). Otherwise, the validation will be deferred until Getty actually performs (de)serialization (i.e., when it's choosing SBTs/DBTs).
This enables the attributes declaration to remain a simple anonymous struct and avoids requiring the user to pass in the type that they're trying to (de)serialize as part of the attributes declaration, which becomes an issue for anonymous and generic types.
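As a rough illustration of how an anonymous attributes struct like the one above could be read at compile time, the sketch below looks up a field's name and skip attributes with the @hasField and @field builtins. The nameOf and isSkipped helpers are hypothetical and are not part of Getty; they only show that no type parameter is needed to consume the declaration.

```zig
const std = @import("std");

// Same shape as the attributes declaration in the proposal.
const attributes = .{
    .Container = .{ .name = "Coordinate" },
    .x = .{ .name = "X" },
    .y = .{ .skip = true },
};

// Hypothetical helper: returns the `name` attribute for `field` if one is
// set, otherwise the field's own name.
fn nameOf(comptime attrs: anytype, comptime field: []const u8) []const u8 {
    if (@hasField(@TypeOf(attrs), field)) {
        const field_attrs = @field(attrs, field);
        if (@hasField(@TypeOf(field_attrs), "name")) return field_attrs.name;
    }
    return field;
}

// Hypothetical helper: returns the `skip` attribute for `field`, defaulting
// to false when the attribute is absent.
fn isSkipped(comptime attrs: anytype, comptime field: []const u8) bool {
    if (@hasField(@TypeOf(attrs), field)) {
        const field_attrs = @field(attrs, field);
        if (@hasField(@TypeOf(field_attrs), "skip")) return field_attrs.skip;
    }
    return false;
}

pub fn main() void {
    std.debug.print("x -> {s}, skipped: {}\n", .{ nameOf(attributes, "x"), isSkipped(attributes, "x") });
    std.debug.print("y -> {s}, skipped: {}\n", .{ nameOf(attributes, "y"), isSkipped(attributes, "y") });
}
```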
If a Getty serializer has a method that is unimplemented (i.e., undefined) and it is called, the following LLVM error is raised:
LLVM Emit Object... LLVM ERROR: Cannot select: 0x7ff488985fb8: ch,glue = X86ISD::CALL 0x7ff48a18d208, undef:ch, Register:i64 $rdi, Register:i32 $esi, RegisterMask:Untyped, 0x7ff48a18d208:1, ser/interfaces/serializer.zig:239:48
0x7ff488985ba8: i64 = Register $rdi
0x7ff4882d7db0: i32 = Register $esi
0x7ff488985800: Untyped = RegisterMask
0x7ff48a18d208: ch,glue = CopyToReg 0x7ff488985868, Register:i32 $esi, 0x7ff488986498, 0x7ff488985868:1, ser/interfaces/serializer.zig:239:48
0x7ff4882d7db0: i32 = Register $esi
0x7ff488986498: i32 = any_extend 0x7ff488985f50, ser/interfaces/serializer.zig:239:48
0x7ff488985f50: i8,ch = CopyFromReg 0x7ff489330868, Register:i8 %3, ser/interfaces/serializer.zig:239:48
0x7ff488986430: i8 = Register %3
0x7ff488985868: ch,glue = CopyToReg 0x7ff48aa3eaa0, Register:i64 $rdi, 0x7ff4882d85d0, ser/interfaces/serializer.zig:239:48
0x7ff488985ba8: i64 = Register $rdi
0x7ff4882d85d0: i64,ch = CopyFromReg 0x7ff489330868, Register:i64 %1, ser/interfaces/serializer.zig:239:48
0x7ff48a18d000: i64 = Register %1
In function: ser.interfaces.serializer.Serializer(main.Serializer,void,error{Io,Syntax},.{ser.blocks.array_list, ser.blocks.bounded_array, ser.blocks.hash_map, ser.blocks.linked_list, ser.blocks.tail_queue, ser.blocks.array, ser.blocks.bool, ser.blocks.enum, ser.blocks.error, ser.blocks.float, ser.blocks.int, ser.blocks.null, ser.blocks.optional, ser.blocks.pointer, ser.blocks.slice, ser.blocks.string, ser.blocks.struct, ser.blocks.tuple, ser.blocks.union, ser.blocks.vector, ser.blocks.void},.{ser.blocks.array_list, ser.blocks.bounded_array, ser.blocks.hash_map, ser.blocks.linked_list, ser.blocks.tail_queue, ser.blocks.array, ser.blocks.bool, ser.blocks.enum, ser.blocks.error, ser.blocks.float, ser.blocks.int, ser.blocks.null, ser.blocks.optional, ser.blocks.pointer, ser.blocks.slice, ser.blocks.string, ser.blocks.struct, ser.blocks.tuple, ser.blocks.union, ser.blocks.vector, ser.blocks.void},lib.TODO,lib.TODO,lib.TODO,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined).getty.Serializer.serializeBool
error: test...
error: The following command terminated unexpectedly:
[...]
Before, we would get a nice "use of undefined value" compile error. And I believe that's still the case with deserializers. I'm not sure if this has something to do with serializers or if it's a Zig issue.
Run the following:
const std = @import("std");
const getty = @import("getty");
const Serializer = struct {
pub usingnamespace getty.Serializer(
@This(),
Ok,
Error,
getty.default_st,
getty.default_st,
getty.TODO,
getty.TODO,
getty.TODO,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
undefined,
);
const Ok = void;
const Error = error{ Io, Syntax };
};
pub fn main() anyerror!void {
var s = Serializer{};
const serializer = s.serializer();
try getty.serialize(true, serializer);
}
The interface function for getty.de.Map should also be renamed from seq to mapAccess.
Deserializing tuples of tuples seems to break the compiler now.
Uncomment the relevant test in the "tuple" test in src/tests/de/tests.zig and run zig build test.
This is due to stage2.
The type returned by Attributes contains all fields/variants of the passed-in type.
There are a few problems with this:
In a serialize function, when serializing a struct or enum, the field serialization loop would be unrolled for every single type in the struct/enum being serialized, which may be quite large. The compiler will prune irrelevant branches but that's still compute that can be avoided.
The attribute map type should consist of just the fields specified in attributes, as long as they are all valid.
The Attributes function's signature should be changed as follows:
-pub fn Attributes(comptime T: type, attributes: _Attributes(T)) type
+pub fn Attributes(comptime T: type, attributes: anytype) type
Then, Attributes calls another function which takes both of Attributes's parameters as input. This function then goes over each field in attributes and checks whether T has a corresponding field or whether the field is named after T.
fn _Attributes(comptime T: type, attributes: anytype) type {
    if (@typeInfo(@TypeOf(attributes)) != .Struct) {
        @compileError("expected attribute map, found " ++ @typeName(@TypeOf(attributes)));
    }
    // Set up attribute map type
    // ...
    inline for (std.meta.fields(@TypeOf(attributes))) |field| {
        if (@hasField(T, field.name) or comptime std.mem.eql(u8, field.name, @typeName(T))) {
            // Add appropriate inner attribute struct to our attribute map type
        } else {
            @compileError("invalid field");
        }
    }
    // Return attribute map type
    // ...
}
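The validation step can be sketched in isolation. The example below is a simplified, hypothetical helper rather than Getty's implementation: it returns a bool instead of building an attribute map type, and it assumes the container entry is named Container rather than being named after T.

```zig
const std = @import("std");

const Point = struct {
    x: i32,
    y: i32,
};

// Hypothetical helper: returns true if every entry in `attrs` either names a
// field of `T` or is the special `Container` entry (an assumption here).
fn isValidAttributeMap(comptime T: type, comptime attrs: anytype) bool {
    inline for (std.meta.fields(@TypeOf(attrs))) |field| {
        const known = @hasField(T, field.name) or std.mem.eql(u8, field.name, "Container");
        if (!known) return false;
    }
    return true;
}

pub fn main() void {
    const ok = comptime isValidAttributeMap(Point, .{ .x = .{ .skip = true } });
    const bad = comptime isValidAttributeMap(Point, .{ .z = .{ .skip = true } });
    std.debug.print("valid: {}, invalid: {}\n", .{ ok, bad });
}
```

Because only the entries of attrs are visited, the loop scales with the number of attributes the user wrote, not with the number of fields in T.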
Type-defined blocks currently need an is function to be specified, like all SBs. However, it's unnecessary since TBs only apply to a single type.
Remove the need for an is function in type-defined blocks!
The interface approach to (de)serializables is a bit useless.
Unlike in Rust where you can implement Serialize and Deserialize on primitive and user data types, you can only do the latter in Zig. One consequence of this is that primitives are assumed to be implicitly (de)serializable by Getty. In contrast, user data types are not assumed to be implicitly (de)serializable.
This is a problem for a few reasons:
Serialize or Deserialize parameters as that would only allow structs and enums.
Serialize or Deserialize for data types outside of your code (e.g., ArrayList).
Serialize and Deserialize implementations for any user data type (comptime ftw). There's no need for an interface when we literally just implement the behavior. Thus, user-provided implementations simply become a mechanism for customization, which can easily be handled by something like std.meta.hasFn.
The proposal is to make all types implicitly (de)serializable. Everything pretty much stays the same, except now there's no interface to implement.
const getty_json = @import("getty_json");
const std = @import("std");
const allocator = std.testing.allocator;
const print = std.debug.print;
const Point = struct {
    x: i32,
    y: i32,
};
pub fn main() anyerror!void {
    const point = Point{ .x = 1, .y = 2 };
    const json = try getty_json.toArrayList(allocator, point);
    defer json.deinit();
    print("{s}\n", .{json.items}); // {"x":1,"y":2}
}
For attributes, it would largely be the same, except you use an Attributes function:
const getty = @import("getty");
const Point = struct {
usingnamespace getty.Attributes(.{ .x = .{ .rename = "a" } });
x: i32,
y: i32,
};
Currently, type-defined (de)serialization tuples can be created.
However, tuples are for grouping blocks that apply to different types, but type-defined serialization logic can only ever apply to the type that defined it, meaning tuples are useless.
Remove type-defined tuples!
Have deserializer come before T.
Currently, methods like serializeSeq return a value of a type that implements getty.ser.Seq.
I think it'd be better if those methods returned the interface value instead
s.seq() thing even though the blocks always use the return value of serializeSeq solely for its interface behavior. They never care about the actual implementation, so they shouldn't have access to it in the first place.
getty.de.SeqAccess to visitors.
Serializers should pass in Seq.@"getty.ser.Seq" for the Seq parameter in their call to getty.Serializer (and Getty should check at comptime that it's a getty.ser.Seq). With that, I think Getty should be able to use it as the return value for serializeSeq.
This would make it so that serializers are the ones writing s.seq() instead of blocks, which is better since the serializers are the ones defining the implementations and so it's fine if they can look at the implementation's details and guts. The blocks shouldn't have access to that stuff though.
A deserialized value is one that is produced by a deserializer. One important property of a deserialized value is that it must be independent of any deserializer or access type. This means, for instance, deserialized pointers must be allocated [1].
In other words, if the nextKeySeed function of a MapAccess returns a pointer which points into a JSON token stream maintained by a deserializer, the returned pointer is not a deserialized value since the value's lifetime depends on the lifetime of a deserializer. The correct implementation would be for nextKeySeed to return an allocated string, and for visitors that call the function to assume the returned value is allocated since it is a pointer.
[1] Deserialized pointers could also reference global constants but Getty (specifically, getty.free) assumes that all pointers are allocated since that's a more common/reasonable way to do things.
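The difference between a borrowed and an owned key can be shown without any Getty types. In the sketch below, nextKeyOwned is a hypothetical stand-in for a MapAccess key function: it copies the key out of the deserializer's buffer, so the result stays valid after the buffer changes.

```zig
const std = @import("std");

// Hypothetical stand-in for a MapAccess key function. Instead of returning a
// slice that points into `input`, it returns an allocated copy that the
// caller owns and must free.
fn nextKeyOwned(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
    return allocator.dupe(u8, input);
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Stands in for a token stream maintained by a deserializer.
    var stream = [_]u8{ 'k', 'e', 'y' };

    const key = try nextKeyOwned(allocator, &stream);
    defer allocator.free(key);

    // Overwriting the stream does not affect the owned key.
    stream[0] = 'X';
    std.debug.print("{s}\n", .{key});
}
```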
I forgot that using function checks instead of decl checks raises compile errors for things like serializeMap and serializeVoid if their method implementations are left undefined.
The error set of the deserializer is passed into the visitors, but how do the visitors know which error within that set they can return?
https://aquilla.red is a package index for zigmod
This would make it simple to add this package to a project via zigmod:
zigmod aq add 1/getty-zig/getty
zigmod fetch
I feel like getty is really missing a visitor for tagged unions. I think this could be done similar to the default behaviour of serde.
Starting from Zig version 67154d2, std.testing.expect* functions can fail, so they need to be updated with try.
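A minimal sketch of the updated call style, using only std.testing functions; the test names are made up for illustration:

```zig
const std = @import("std");

test "expect functions are called with try" {
    try std.testing.expect(1 + 1 == 2);
    try std.testing.expectEqual(@as(u32, 2), 1 + 1);
    try std.testing.expectEqualStrings("getty", "getty");
}

test "a failed expectation is an ordinary error value" {
    try std.testing.expectError(error.TestUnexpectedResult, std.testing.expect(false));
}
```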
On zig master (first seen in 0.10.0-dev.3978+4fd4c733d), stage2 compilations for some programs implementing getty.Serializer (and probably getty.Deserializer) go into an infinite loop and eventually crash.
The problem seems to occur when a method implementation is provided for a required method that contains an anytype parameter (e.g., serializeFloat, serializeInt). This causes the infinite loop crash. Providing implementations for methods like serializeBool or serializeNull still works completely fine.
const std = @import("std");
const getty = @import("getty");
const Serializer = struct {
pub usingnamespace getty.Serializer(
@This(),
Ok,
Error,
null,
null,
null,
null,
null,
.{
.serializeBool = serializeBool,
},
);
const Ok = void;
const Error = getty.ser.Error || error{Foobar};
fn serializeBool(_: @This(), value: bool) !Ok {
std.debug.print("{}\n", .{value});
}
};
pub fn main() anyerror!void {
var s = Serializer{};
const serializer = s.serializer();
try getty.serialize(null, true, serializer);
}
const std = @import("std");
const getty = @import("getty");
const Serializer = struct {
pub usingnamespace getty.Serializer(
@This(),
Ok,
Error,
null,
null,
null,
null,
null,
.{
.serializeInt = serializeInt,
},
);
const Ok = void;
const Error = getty.ser.Error || error{Foobar};
fn serializeInt(_: @This(), value: anytype) !Ok {
std.debug.print("{}\n", .{value});
}
};
pub fn main() anyerror!void {
var s = Serializer{};
const serializer = s.serializer();
try getty.serialize(null, 123, serializer);
}
$ zig build run
zig build-exe test Debug native: error: thread 431519 panic: attempt to unwrap error
Unable to dump stack trace: debug info stripped
zig build-exe test Debug native: error: the following command terminated unexpectedly:
/Users/jason/.asdf/installs/zig/master/zig build-exe /Users/jason/Projects/Personal/test/src/main.zig --cache-dir /Users/jason/Projects/Personal/test/zig-cache --global-cache-dir /Users/jason/.cache/zig --name test --mod getty::/Users/jason/.cache/zig/p/1220c649e62d6bb0e07ac6b02711f76254f1dd1875859ad075873d248fb820f68d66/src/getty.zig --deps getty --listen=-
Build Summary: 0/5 steps succeeded; 1 failed (disable with -fno-summary)
run transitive failure
└─ run test transitive failure
├─ zig build-exe test Debug native failure
└─ install transitive failure
└─ install test transitive failure
└─ zig build-exe test Debug native (reused)
error: the following build command failed with exit code 1:
/Users/jason/Projects/Personal/test/zig-cache/o/ee1be17a562f060aa4ae0c496b5dcfcc/build /Users/jason/.asdf/installs/zig/master/zig /Users/jason/Projects/Personal/test /Users/jason/Projects/Personal/test/zig-cache /Users/jason/.cache/zig run
Compiling with stage1 fixes the problem. So if you need a workaround for now, the following should work for you:
zig build -fstage1 run
In the following code, trait.isZigString(T) should be made comptime.
Line 50 in 721e88d
The issue I think is that the comptime applies to the next condition as well, which throws up an error. Need to figure out a clean way around that.
In order to signal that a required method of a Getty interface is not implemented, the implementing type must currently pass undefined into the interface for that required method. Obviously, this isn't ideal.
The best option would be to pass in null for unimplemented methods. In fact, that's what Getty used to do. However, it was found that Zig would crash whenever an unimplemented generic method was called. And so, Getty transitioned from optional required methods to mandatory ones.
However! I just tried out optional required methods on a whim today, and it looks like they may be possible now!
Required method parameters in Getty interfaces can be marked as optional. For example,
fn BoolSerializer(
comptime Context: type,
comptime O: type,
comptime E: type,
comptime serializeBoolFn: ?fn (Context, anytype) E!O,
) type
Then, the interface's implementation would look something like this:
fn BoolSerializer(
comptime Context: type,
comptime O: type,
comptime E: type,
comptime serializeBoolFn: ?fn (Context, anytype) E!O,
) type {
return struct {
pub const Interface = struct {
context: Context,
pub const Ok = O;
pub const Error = E;
pub fn serializeBool(self: @This(), value: anytype) Error!Ok {
if (serializeBoolFn) |f| {
return f(self.context, value);
}
@compileError("unimplemented"); // This is only raised if serializeBoolFn is null!
}
};
pub fn boolSerializer(self: Context) Interface {
return .{ .context = self };
}
};
}
The implementation side of things would remain the same, except they can now pass in null instead of undefined for unimplemented required methods.
No idea why this all works now, and I'll have to do a bit more testing to see whether this works for everything (e.g., deserialization), but I'm really excited about this!
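For the implementing side, a self-contained sketch might look like the following. It is a trimmed-down version of the interface above that takes a concrete bool parameter instead of anytype to keep it small; Impl, WithBool, and WithoutBool are hypothetical names, and this is an illustration under those assumptions rather than Getty's actual interface.

```zig
const std = @import("std");

// Trimmed-down interface with one optional required method.
fn BoolSerializer(
    comptime Context: type,
    comptime O: type,
    comptime E: type,
    comptime serializeBoolFn: ?fn (Context, bool) E!O,
) type {
    return struct {
        context: Context,

        pub fn serializeBool(self: @This(), value: bool) E!O {
            if (serializeBoolFn) |f| return f(self.context, value);
            // Only reached during compilation when null was passed in and
            // serializeBool is actually called.
            @compileError("serializeBool is not implemented");
        }
    };
}

// Hypothetical implementation that provides the method.
const Impl = struct {
    fn serializeBool(_: Impl, value: bool) error{Io}!u8 {
        return if (value) 1 else 0;
    }
};

// Passing the method: calls are forwarded to Impl.serializeBool.
const WithBool = BoolSerializer(Impl, u8, error{Io}, Impl.serializeBool);

// Passing null: this compiles as long as serializeBool is never called.
const WithoutBool = BoolSerializer(Impl, u8, error{Io}, null);

pub fn main() !void {
    const s = WithBool{ .context = .{} };
    std.debug.print("{d}\n", .{try s.serializeBool(true)});
    _ = WithoutBool{ .context = .{} };
}
```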
Currently, to implement Serialize and Deserialize, you need the following:
fn serialize(self: Self, comptime S: type, serializer: *S) S.Error!S.Ok
fn deserialize(self: Self, comptime D: type, deserializer: *D) D.Error!D.Ok
We can simplify this by using anytype since that's basically just sugar for what we have above:
const std = @import("std");
fn ErrorUnion(comptime T: type) type {
return switch (@typeInfo(T)) {
.Pointer => std.meta.Child(T).Error!std.meta.Child(T).Ok,
else => @compileError("expected pointer, found " ++ @typeName(T)),
};
}
fn serialize(self: Self, serializer: anytype) ErrorUnion(@TypeOf(serializer))
fn deserialize(self: Self, deserializer: anytype) ErrorUnion(@TypeOf(deserializer))
We can also do the same for getty.serialize.
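A runnable sketch of the anytype-based signature is shown below. CountingSerializer and Flag are hypothetical types introduced only for this example, and ErrorUnion is simplified to rely on std.meta.Child, which itself rejects types that have no child.

```zig
const std = @import("std");

// Resolves the return type from a pointer-to-serializer type.
fn ErrorUnion(comptime Ptr: type) type {
    const S = std.meta.Child(Ptr);
    return S.Error!S.Ok;
}

// Hypothetical serializer that counts the values it receives.
const CountingSerializer = struct {
    count: usize = 0,

    pub const Ok = void;
    pub const Error = error{Io};

    pub fn serializeBool(self: *CountingSerializer, _: bool) Error!Ok {
        self.count += 1;
    }
};

// Hypothetical serializable type using the anytype-based signature.
const Flag = struct {
    on: bool,

    pub fn serialize(self: Flag, serializer: anytype) ErrorUnion(@TypeOf(serializer)) {
        return serializer.serializeBool(self.on);
    }
};

pub fn main() !void {
    var s = CountingSerializer{};
    try (Flag{ .on = true }).serialize(&s);
    std.debug.print("{d}\n", .{s.count});
}
```

The caller no longer names the serializer type explicitly; it is recovered from the argument with @TypeOf.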
This might also let us use the Allocator interface design, but I'm not sure.
Edit: Being able to use the Allocator design might not even make sense for Serialize and Deserialize, since being able to pass around Serializes or Deserializes is practically useless when you can't pass around primitives. So maybe we just simplify and then continue using the writer/reader design.
Currently, Getty assumes that all deserialized pointers are allocated. Initially, I felt that this assumption was reasonable and consistent, and that it kept Deserializer and Visitor implementations simple.
The issue, however, is that not all deserialized pointers are part of the value produced by a Visitor. For example, during struct deserialization, deserialized keys (i.e., slices) are used only to check against a struct's field name. The key isn't a part of the final struct value at all. In such cases, allocating the key is not only pointless, it's terrible for performance.
The current workaround is to provide, in the deserializer implementation, a custom deserialization block for struct values. But that's a lot of work that will have to be repeated across libraries and it doesn't change the fact that the default behavior for struct deserialization is slow.
So, I propose that Getty should assume that all deserialized pointers are allocated except during struct key deserialization (struct value deserialization will still assume that pointers are allocated). This will make Deserializer implementations better and faster, and Visitor impls can still be kept simple.
Zig 0.10.0-master (downloaded yesterday)
The serialization test passes:
test "struct with nullables" {
const T2 = struct { a: ?u32, b: ?u32, c: ?u32 = null, d: ?u32 = 5 };
try t(T2{ .a = 1, .b = null }, &[_]Token{
.{ .Struct = .{ .name = @typeName(T2), .len = 4 } },
.{ .String = "a" },
.{ .Some = {} },
.{ .U32 = 1 },
.{ .String = "b" },
.{ .Null = {} },
.{ .String = "c" },
.{ .Null = {} },
.{ .String = "d" },
.{ .Some = {} },
.{ .U32 = 5 },
.{ .StructEnd = {} },
});
}
but the same code fails for deserialization. In fact, just this part is already broken:
const T1 = struct { a: ?u32 = 4 };
try t(T1{ .a = 1 }, &[_]Token{
.{ .Struct = .{ .name = @typeName(T1), .len = 1 } },
.{ .String = "a" },
.{ .Some = {} },
.{ .U32 = 1 },
.{ .StructEnd = {} },
});
The error:
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\impls\visitor\struct.zig:72:61: error: expected type '?u32', found '*const anyopaque'
@field(structure, field.name) = default;
^~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\lib.zig:41:9: error: concept `getty.Deserializer` was not satisfied: missing `context` field
@compileError("concept `" ++ concept ++ "` was not satisfied: " ++ msg);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\concepts\deserializer.zig:10:25: note: called from here
concepts.err(concept, "missing `context` field");
~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\interfaces\visitor.zig:137:47: note: called from here
concepts.@"getty.Deserializer"(@TypeOf(deserializer));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
I went on to patch src/de/impls/visitor/struct.zig:72:
@field(structure, field.name) = default;
⬇
@field(structure, field.name) = @as(field.field_type, default);
but then I get a different error that I don't know how to fix. It also feels like a compiler bug?
All 21 tests passed.
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\impls\visitor\struct.zig:35:28: error: expected type '?u32', found '*const anyopaque'
inline for (fields) |field, i| {
~~~~~~~^~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\lib.zig:41:9: error: concept `getty.Deserializer` was not satisfied: missing `context` field
@compileError("concept `" ++ concept ++ "` was not satisfied: " ++ msg);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\concepts\deserializer.zig:10:25: note: called from here
concepts.err(concept, "missing `context` field");
~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\getty-getty-zig-github.com-d032ff42\pkg\src\de\interfaces\visitor.zig:137:47: note: called from here
concepts.@"getty.Deserializer"(@TypeOf(deserializer));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
The relevant code in de/impls/visitor/struct.zig:
errdefer {
    if (allocator) |alloc| {
        inline for (fields) |field, i| { // <--- line 35 from the error
            if (!field.is_comptime and seen[i]) {
                getty.de.free(alloc, @field(structure, field.name));
            }
        }
    }
}
Add a deserialization test to src/tests/ser/tests.zig and run it with zig build test in the main folder:
test "struct with nullables" {
    const T1 = struct { a: ?u32 = 4 };
    try t(T1{ .a = 1 }, &[_]Token{
        .{ .Struct = .{ .name = @typeName(T1), .len = 1 } },
        .{ .String = "a" },
        .{ .Some = {} },
        .{ .U32 = 1 },
        .{ .StructEnd = {} },
    });
    const T2 = struct { a: ?u32, b: ?u32, c: ?u32 = null, d: ?u32 = 5 };
    try t(T2{ .a = 1, .b = null }, &[_]Token{
        .{ .Struct = .{ .name = @typeName(T2), .len = 4 } },
        .{ .String = "a" },
        .{ .Some = {} },
        .{ .U32 = 1 },
        .{ .String = "b" },
        .{ .Null = {} },
        .{ .String = "c" },
        .{ .Null = {} },
        .{ .String = "d" },
        .{ .Some = {} },
        .{ .U32 = 5 },
        .{ .StructEnd = {} },
    });
}
The proposal in #55 got me thinking.
In Getty JSON, the implementation for the aggregate serialization interfaces (e.g., getty.ser.Seq, getty.ser.Map) has a field referencing the serializer passed to serializeSeq, serializeStruct, etc. I believe the field is used in every single one of the aggregate interface methods (no surprise there; most of them are literally called serializeX).
But having to carry this field around in the implementation isn't ideal. Any serialization block (SB) that calls serializeSeq already has a reference to the same serializer, so there's no need to store it in the implementation. Also, it makes more sense to have the serializer as a parameter of the methods doing the serialization than on the impl itself.
So, what if we pass in the serializer to the methods like serializeElement, serializeKey, and so on? The SB calling serializeStruct would always have a reference to the relevant serializer, so it'd be easy to pass it in for them.
This would simplify implementations of the aggregate interfaces since they'd no longer need to keep a redundant serializer around.
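A minimal sketch of what this could look like, assuming a toy Serializer that writes into a fixed buffer and a toy Seq type. Both are hypothetical stand-ins for illustration, not Getty's or Getty JSON's actual types or signatures. The point is only that Seq keeps its own state (whether a comma is needed) and receives the serializer as a parameter instead of storing it:

```zig
const std = @import("std");

// Hypothetical toy serializer that appends bytes to a fixed buffer.
const Serializer = struct {
    buf: [64]u8 = undefined,
    len: usize = 0,

    fn write(self: *Serializer, bytes: []const u8) !void {
        if (self.len + bytes.len > self.buf.len) return error.NoSpaceLeft;
        for (bytes) |b| {
            self.buf[self.len] = b;
            self.len += 1;
        }
    }

    fn serializeInt(self: *Serializer, value: i32) !void {
        var tmp: [16]u8 = undefined;
        const text = try std.fmt.bufPrint(&tmp, "{d}", .{value});
        try self.write(text);
    }

    // Starts a sequence; the returned Seq holds no reference to the serializer.
    fn serializeSeq(self: *Serializer) !Seq {
        try self.write("[");
        return Seq{};
    }

    fn written(self: *const Serializer) []const u8 {
        return self.buf[0..self.len];
    }
};

// Hypothetical aggregate implementation: only sequence-local state lives here.
const Seq = struct {
    first: bool = true,

    // The serializer is passed in by the caller on every call.
    fn serializeElement(self: *Seq, serializer: *Serializer, value: i32) !void {
        if (!self.first) try serializer.write(",");
        self.first = false;
        try serializer.serializeInt(value);
    }

    fn end(self: *Seq, serializer: *Serializer) !void {
        _ = self;
        try serializer.write("]");
    }
};
```

A caller that already holds the serializer would write `var seq = try s.serializeSeq();` followed by `try seq.serializeElement(&s, 1);` and `try seq.end(&s);`.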
I couldn't decide whether this should be one ticket or two. My first impression was that serialization generally works, but it turned out that I can't move forward because deserialization doesn't. The latter even requires more than just implementing custom blocks, as mentioned in #37.
pub fn main() void {
    const Test = struct {
        some_map: std.AutoHashMap(c_int, c_int),
    };
    var obj: Test = .{
        .some_map = std.AutoHashMap(c_int, c_int).init(std.heap.page_allocator),
    };
    _ = obj.some_map.put(3, 123) catch unreachable;
    var serialized = json.toPrettySlice(gpa.allocator(), obj) catch unreachable;
    std.debug.print("{s}", .{serialized});
}
This prints:
{
  "some_map": {
    3: 123
  }
}
However, the key 3 is invalid because JSON only allows strings as object keys. The expected output is:
{
  "some_map": {
    "3": 123
  }
}
When deserializing the same struct, like this:
var deserialized = json.fromSlice(std.heap.page_allocator, @TypeOf(obj), "{\"some_map\": {\"3\": 123}}") catch unreachable;
std.debug.print("{any}", .{deserialized});
we get an error:
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:75:21: error: concept `StringKey` was not satisfied (expected key type to be `[]const u8`)
else => @compileError(base ++ extra),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\concepts-ibokuri-github.com-05c73681\pkg\src\lib.zig:48:57: note: called from here
for (results) |result| if (!result) fail(concept, msg);
~~~~^~~~~~~~~~~~~~
W:\dev\definitive_dialoguer\.gyro\json-getty-zig-github.com-5753d08c\pkg\src\de\deserializer.zig:326:91: note: called from here
comptime concepts.Concept("StringKey", "expected key type to be `[]const u8`")(.{
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
This is because MapAccess.nextKeySeed has this restriction:
comptime concepts.Concept("StringKey", "expected key type to be `[]const u8`")(.{
    concepts.traits.isSame(@TypeOf(seed).Value, []const u8),
});
As I mentioned in #37, I can't get past this error by implementing a custom deserialization block.
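For illustration, the round trip an integer-keyed map needs can be sketched with the standard library alone. The helpers writeKey and parseKey below are hypothetical names, not part of Getty or Getty JSON; they only show the conversion (integer to quoted string on output, quoted string back to integer on input), not how it would be wired into MapAccess:

```zig
const std = @import("std");

// Hypothetical helper: renders an integer key as a quoted JSON object key, e.g. 3 -> "3".
fn writeKey(buf: []u8, key: c_int) ![]const u8 {
    const text = try std.fmt.bufPrint(buf, "\"{d}\"", .{key});
    return text;
}

// Hypothetical helper: strips the surrounding quotes and parses the key back into an integer.
fn parseKey(quoted: []const u8) !c_int {
    if (quoted.len < 2 or quoted[0] != '"' or quoted[quoted.len - 1] != '"') {
        return error.InvalidKey;
    }
    return std.fmt.parseInt(c_int, quoted[1 .. quoted.len - 1], 10);
}
```

A deserializer that accepted non-string key types could apply a conversion like parseKey after reading the string key, instead of rejecting the key type at compile time.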
The interface function for getty.de.Seq should also be renamed from seq to seqAccess.
std.net.Address can't be (de)serialized on Windows at the moment due to an issue in resolveIP (I think). Here's a build that failed due to this issue.
In the net_address.zig block files, remove the enclosing if (builtin.os.tag != .windows) statements from the tests and then run the tests on Windows.
const Point = struct {
    usingnamespace Attributes(@This(), .ser, .{ .x = .{ .rename = "a" } });
    usingnamespace Attributes(@This(), .de, .{ .x = .{ .rename = "de_a" } });
    x: i32,
    y: i32,
};
const Point = struct {
    usingnamespace Attributes(@This(), .{ .x = .{ .rename = "a", .rename_deserialize = "de_a" } });
    x: i32,
    y: i32,
};
The visitors aren't getting referenced by the tests which is a bit annoying.
This is how struct visitors deallocate memory when a deserialization error occurs:
errdefer {
    if (allocator) |alloc| {
        inline for (fields) |field, i| {
            if (!field.is_comptime and seen[i]) {
                de.free(alloc, @field(structure, field.name));
            }
        }
    }
}
And this is how the array visitor does deallocation:
errdefer {
    if (allocator) |alloc| {
        if (array.len > 0) {
            var i: usize = 0;
            while (i < seen) : (i += 1) {
                de.free(alloc, array[i]);
            }
        }
    }
}
Everything is very manual and annoying to write. And even for user-defined types like std.ArrayList, we have to defer to getty.de.free, which itself has to manually walk through the list and free each child element before freeing the actual list.
Wrap the allocator passed to Getty in an arena allocator. Visitors can then just allocate whatever they want, and all the correct allocations will be freed with a simple call to allocator.free.
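A minimal sketch of the arena idea, assuming a hypothetical visitor body visitPair and a hypothetical entry point sumPair (neither is part of Getty). With std.heap.ArenaAllocator, everything allocated through the arena is released by a single deinit call, so the visitor needs no errdefer bookkeeping of its own even when it fails partway through:

```zig
const std = @import("std");

// Hypothetical visitor body: allocates a slice and two pointers, then optionally fails.
// It contains no cleanup code of its own.
fn visitPair(allocator: std.mem.Allocator, fail: bool) ![]*i32 {
    const items = try allocator.alloc(*i32, 2);
    var i: usize = 0;
    while (i < items.len) : (i += 1) {
        items[i] = try allocator.create(i32);
        items[i].* = 7;
    }
    if (fail) return error.InvalidInput;
    return items;
}

// Hypothetical entry point: wraps the caller's allocator in an arena.
// The deferred deinit frees every allocation made by the visitor,
// on both the success path and the error path.
fn sumPair(backing: std.mem.Allocator, fail: bool) !i32 {
    var arena = std.heap.ArenaAllocator.init(backing);
    defer arena.deinit();

    const items = try visitPair(arena.allocator(), fail);
    return items[0].* + items[1].*;
}
```

In this sketch the arena is torn down before returning, which only works because the result is a plain integer. A real design would have to decide who owns the arena when the deserialized value itself contains pointers.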