typify's Introduction

Typify

Typify compiles JSON Schema documents into Rust types. It can be used in one of several ways:

  • using the cargo typify command

  • via the macro import_types!("types.json") to generate Rust types directly in your program

  • via a builder interface to generate Rust types in build.rs or xtask

  • via the builder functions to generate persistent files e.g. when building API bindings

If generation fails, doesn't compile, or is generally lousy, please file an issue and include the JSON Schema and Rust output (if there is any). Use the cargo typify command to generate code from the command line. It's even more helpful if you can articulate the output you'd ideally like to see.

JSON Schema → Rust types

Typify translates JSON Schema types in a few different ways depending on some basic properties of the schema:

Built-in types

Integers, floating-point numbers, strings, etc. Those all have straightforward representations in Rust. The only significant nuance is how to select the appropriate built-in type based on type attributes. For example, a JSON Schema might specify a maximum and/or minimum that indicates the appropriate integral type to use.
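As a rough illustration of how bounds can drive the choice of integer type, the sketch below picks the narrowest Rust integer whose range covers a schema's minimum and maximum. The function name and the candidate ordering are assumptions made for this example; this is not Typify's actual selection logic.

```rust
/// Pick the narrowest Rust integer type whose range covers [min, max].
/// Hypothetical helper for illustration; not Typify's real algorithm.
fn narrowest_int_type(min: i128, max: i128) -> Option<&'static str> {
    // Candidates ordered narrowest first, unsigned before signed, so that
    // non-negative ranges prefer an unsigned type of the same width.
    let candidates: [(&'static str, i128, i128); 8] = [
        ("u8", u8::MIN as i128, u8::MAX as i128),
        ("i8", i8::MIN as i128, i8::MAX as i128),
        ("u16", u16::MIN as i128, u16::MAX as i128),
        ("i16", i16::MIN as i128, i16::MAX as i128),
        ("u32", u32::MIN as i128, u32::MAX as i128),
        ("i32", i32::MIN as i128, i32::MAX as i128),
        ("u64", u64::MIN as i128, u64::MAX as i128),
        ("i64", i64::MIN as i128, i64::MAX as i128),
    ];
    if min > max {
        return None; // contradictory bounds
    }
    candidates
        .iter()
        .find(|(_, lo, hi)| *lo <= min && max <= *hi)
        .map(|(name, _, _)| *name)
}
```

With this sketch, bounds of 1 and 6 select u8, while a minimum of -1 forces a signed type.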

String schemas that include a format are represented with the appropriate Rust type. For example { "type": "string", "format": "uuid" } is represented as a uuid::Uuid (which requires the uuid crate be included as a dependency).

Arrays

JSON Schema arrays can turn into one of three Rust types, Vec<T>, HashSet<T>, or a tuple, depending on the schema properties. An array may have a fixed length that matches a fixed list of item types; this is well represented by a Rust tuple. The choice between Vec<T> and HashSet<T> depends only on whether the schema's uniqueItems field is false or true respectively.
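The decision described above can be sketched as a small function. The enum, the function, and its parameters are hypothetical names for this example; the real implementation works on full schema objects rather than these simplified inputs.

```rust
/// Which Rust shape an array schema maps to (illustrative only).
#[derive(Debug, PartialEq)]
enum ArrayRepr {
    /// Fixed-length list of per-position item schemas, e.g. `(String, i64)`.
    Tuple { arity: usize },
    /// `uniqueItems: true`, e.g. `HashSet<T>`.
    HashSet,
    /// Everything else, e.g. `Vec<T>`.
    Vec,
}

/// `positional_items` is `Some(n)` when `items` is an array of `n` schemas.
fn array_repr(
    positional_items: Option<usize>,
    min_items: Option<usize>,
    max_items: Option<usize>,
    unique_items: bool,
) -> ArrayRepr {
    match positional_items {
        // The length is pinned to the number of positional schemas.
        Some(n) if min_items == Some(n) && max_items == Some(n) => {
            ArrayRepr::Tuple { arity: n }
        }
        // Simplification in this sketch: positional items whose
        // cardinalities don't match fall through to the cases below.
        _ if unique_items => ArrayRepr::HashSet,
        _ => ArrayRepr::Vec,
    }
}
```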

Objects

In general, objects turn into Rust structs. If, however, the schema defines no properties, Typify emits a HashMap<String, T> if the additionalProperties schema specifies T or a HashMap<String, serde_json::Value> otherwise.

Properties that are not in the required set are typically represented as an Option<T> with the #[serde(default)] attribute applied. Non-required properties with types that already have a default value (such as a Vec<T>) simply get the #[serde(default)] attribute (so you won't see e.g. Option<Vec<T>>).

OneOf

The oneOf construct maps to a Rust enum. Typify maps this to the various serde enum types.

AllOf

The allOf construct is handled by merging schemas. While most of the time Typify tries to preserve and share type names, it can't always do this when merging schemas. You may end up with fields replicated across types; optimizing this generation is an area of active work.

AnyOf

The anyOf construct is much trickier. It can be close to an enum (oneOf), but where no particular variant might be canonical or unique for particular data. While today we (imprecisely) model these as structs with optional, flattened members, this is one of the weaker areas of code generation.

Issues describing example schemas and desired output are welcome and helpful.

Formatting

You can format generated code using crates such as rustfmt-wrapper and prettyplease. This can be particularly useful when checking in code or emitting code from a build.rs.

The examples below show different ways to convert a TypeSpace to a string (typespace is a typify::TypeSpace).

rustfmt

Best for generation of code that might be checked in alongside hand-written code such as in the case of an xtask or stand-alone code generator (like cargo-typify).

rustfmt_wrapper::rustfmt(typespace.to_stream().to_string())?

prettyplease

Best for build.rs scripts, which should be self-contained because users building the crate as a transitive dependency might not have rustfmt installed.

prettyplease::unparse(&syn::parse2::<syn::File>(typespace.to_stream())?)

No formatting

If no human will ever see the code (and this is almost never the case).

typespace.to_stream().to_string()

WIP

Typify is a work in progress. Changes that affect output will be indicated with a breaking change to the crate version number.

In general, if you have a JSON Schema that causes Typify to fail or if the generated type isn't what you expect, please file an issue.

There are some known areas where we'd like to improve:

Complex JSON Schema types

JSON schema can express a wide variety of types. Some of them are easy to model in Rust; others aren't. There's a lot of work to be done to handle esoteric types. Examples from users are very helpful in this regard.

Bounded numbers

Bounded numbers aren't very well handled. Consider, for example, the schema:

{
  "type": "integer",
  "minimum": 1,
  "maximum": 6
}

The resulting types won't enforce those value constraints.
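One way such a constraint could be enforced is a newtype with a checked conversion. The sketch below is hand-written for illustration: the DieRoll name and the String error type are assumptions, and this is not output that Typify is stated to produce.

```rust
use std::convert::TryFrom;

/// Hypothetical newtype for the schema above: an integer in 1..=6.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DieRoll(u8);

impl TryFrom<i64> for DieRoll {
    type Error = String;

    fn try_from(value: i64) -> Result<Self, Self::Error> {
        // Reject anything outside the schema's minimum/maximum.
        if (1..=6).contains(&value) {
            Ok(DieRoll(value as u8))
        } else {
            Err(format!("{} is outside the range 1..=6", value))
        }
    }
}

impl From<DieRoll> for i64 {
    fn from(roll: DieRoll) -> i64 {
        i64::from(roll.0)
    }
}
```

Because the inner field is private, the only way to build a DieRoll is through the checked conversion, so an out-of-range value cannot be constructed.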

Configurable dependencies

A string schema with format set to uuid will result in the uuid::Uuid type; similarly, a format of date translates to chrono::naive::NaiveDate. For users that don't want dependencies on uuid or chrono it would be useful for Typify to optionally represent those as String (or as some other, consumer-specified type).

typify's Issues

handling for arbitrary derive cycles

JSON Schema can define types that have cycles. This is simple to handle in languages like JavaScript or Java, but more complex in Rust since those cycles must be explicitly broken with a Box<T>. Note that use of a Vec or HashMap also breaks the containment cycle but has implications for derive computation which we will discuss later. Note too that where one "breaks" a cycle may have multiple solutions, some that require more breaks than others. Note also that it may not be feasible to reconstruct the types e.g. if the JSON Schema were derived from Rust types because the information about Box indirections is explicitly discarded (and probably reasonably so, but one could imagine including hints; more on that later as well).

Currently we break trivial A -> A cycles such as:

struct A {
    a: Option<Box<A>>, // this needs to be boxed
}

We can do this without a bunch of graph traversal and it solved a proximate problem.

The more general case requires us to decompose the type graph into strongly connected subgraphs that form a DAG (e.g. with algorithms proposed by Tarjan, Dijkstra or Kosaraju). In this case, the edges are defined by structure or newtype containment either directly or via an Option type. Within each strongly connected subgraph we then would determine where to "break" the cycles by inserting Boxes. The general case of this requires exponential time to compute. While the number of nodes (types) in a cycle is likely to be small, we still may elect for a heuristic, the simplest of which would be to cut all edges. There's very little harm in cutting more than is absolutely required--the serialization isn't affected for example--the only consequence is to the legibility and ergonomics of the generated types.
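A minimal sketch of that approach follows, using Tarjan's algorithm plus the "cut every edge inside a component" heuristic. Types are represented as plain indices and containment edges as adjacency lists; all names here are assumptions for the example, not Typify's internals.

```rust
/// State for Tarjan's strongly-connected-components algorithm.
/// Node `i` is a type; an edge `i -> j` means type `i` contains type `j`
/// directly or via `Option`.
struct Tarjan<'a> {
    graph: &'a [Vec<usize>],
    index: Vec<Option<usize>>,
    lowlink: Vec<usize>,
    on_stack: Vec<bool>,
    stack: Vec<usize>,
    next_index: usize,
    components: Vec<Vec<usize>>,
}

impl<'a> Tarjan<'a> {
    fn visit(&mut self, v: usize) {
        self.index[v] = Some(self.next_index);
        self.lowlink[v] = self.next_index;
        self.next_index += 1;
        self.stack.push(v);
        self.on_stack[v] = true;

        let graph = self.graph; // copy the shared reference out of `self`
        for &w in &graph[v] {
            match self.index[w] {
                None => {
                    self.visit(w);
                    self.lowlink[v] = self.lowlink[v].min(self.lowlink[w]);
                }
                Some(w_index) if self.on_stack[w] => {
                    self.lowlink[v] = self.lowlink[v].min(w_index);
                }
                Some(_) => {} // already assigned to a finished component
            }
        }

        // `v` is the root of a component: pop the stack down to `v`.
        if Some(self.lowlink[v]) == self.index[v] {
            let mut component = Vec::new();
            loop {
                let w = self.stack.pop().expect("stack contains v");
                self.on_stack[w] = false;
                component.push(w);
                if w == v {
                    break;
                }
            }
            component.sort_unstable();
            self.components.push(component);
        }
    }
}

fn strongly_connected_components(graph: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let n = graph.len();
    let mut state = Tarjan {
        graph,
        index: vec![None; n],
        lowlink: vec![0; n],
        on_stack: vec![false; n],
        stack: Vec::new(),
        next_index: 0,
        components: Vec::new(),
    };
    for v in 0..n {
        if state.index[v].is_none() {
            state.visit(v);
        }
    }
    state.components
}

/// Heuristic: box every containment edge whose endpoints share a component
/// (this includes self-loops such as `A -> A`).
fn edges_to_box(graph: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let mut component_of = vec![0; graph.len()];
    for (id, component) in strongly_connected_components(graph).iter().enumerate() {
        for &v in component {
            component_of[v] = id;
        }
    }
    let mut cuts = Vec::new();
    for (v, targets) in graph.iter().enumerate() {
        for &w in targets {
            if component_of[v] == component_of[w] {
                cuts.push((v, w));
            }
        }
    }
    cuts
}
```

An edge between two nodes of the same component always lies on some cycle, so this heuristic never boxes an edge that is entirely outside a cycle; it may still box more edges than the minimum needed.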

For JSON Schema generated from Rust types, it could be helpful to annotate boxed types with an extension. This could act as a heuristic when slicing a strongly connected component i.e. we use these extensions to see if they properly break the containment cycle and do something else if they don't.


The derive macros we apply to types have a similar problem. Consider, for example, the following type:

struct A {
    value: u32,
}

For this struct we could #[derive(Eq, PartialEq)], but if we change the u32 to an f32 we could not! A Vec<T> is Eq only if T: Eq and a HashSet<T> isn't Ord regardless of the traits implemented by T.

From the list of desirable traits to implement such as Hash, Ord, and Eq, the ones we can apply to a type depend on the types to which it refers. And those references may form a cycle. As above, we must compute the strongly connected components. Above the edges were containment; here the edges are all references (i.e. a Vec is an edge here but not above). Within each strongly connected component we must take the intersection of all supportable traits.
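The intersection step can be sketched as below. Each set holds the traits one member of the component could support when considered in isolation; the function name and the use of string trait names are assumptions for the example. A full solution would also have to account for the traits supported by components further down the reference graph, which this sketch leaves out.

```rust
use std::collections::BTreeSet;

/// Intersect the per-type trait sets of one strongly connected component.
/// Hypothetical helper; trait names are plain strings for illustration.
fn derivable_for_component(per_type: &[BTreeSet<&'static str>]) -> BTreeSet<&'static str> {
    let mut members = per_type.iter();
    let first = match members.next() {
        Some(set) => set.clone(),
        None => return BTreeSet::new(), // empty component: nothing to derive
    };
    // Keep only the traits that every member of the component supports.
    members.fold(first, |acc, set| acc.intersection(set).copied().collect())
}
```

For example, if one member supports Eq, Hash, Ord, PartialEq and PartialOrd but another contains an f32 and supports only PartialEq and PartialOrd, the whole component is limited to PartialEq and PartialOrd.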

Allow replacing `chrono` types

Currently the generated types for date-format and date are using chrono based types, partially marked as deprecated. It'd be great if the types could be replaced with custom types.

Panic for float with minimum/maximum/multiple_of/etc

Sample schema which triggers this panic:

    "idNum": {
      "type": "integer",
      "maximum": 30,
      "minimum": 20,
      "multipleOf": 5
    }

Result:

thread 'main' panicked at 'assertion failed: validation.minimum.is_none()', /home/jayvdb/.cargo/git/checkouts/typify-288d5a84bbbe6a46/aa5e5a6/typify-impl/src/convert.rs:720:13

#109 by @justinmmott provided a fix for that, but it has been closed unmerged.
I'm keen to hear what the next steps are for resolving this. If I can help get this solved, I will be able to use unpatched typify, which would make it much easier to maintain a few enhancements for progenitor such as oxidecomputer/progenitor#252.

Enums should impl `ToString` or `Display`

There's an upstream failure in progenitor where enums in a path that have variants which take a single parameter can't be converted to a string because they don't implement Display. See the upstream issue oxidecomputer/progenitor#253. To support this use case we'll need enums to either derive Display or implement ToString.
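A minimal sketch of what such an implementation could look like for an enum whose variants each wrap a single value. The NameOrId type is hypothetical and only stands in for a generated enum.

```rust
use std::fmt;

/// Hypothetical enum with single-parameter variants, as might appear in a path.
#[derive(Clone, Debug, PartialEq)]
pub enum NameOrId {
    Id(u64),
    Name(String),
}

impl fmt::Display for NameOrId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the wrapped value so the variant name never
        // appears in the rendered string.
        match self {
            NameOrId::Id(id) => write!(f, "{}", id),
            NameOrId::Name(name) => write!(f, "{}", name),
        }
    }
}
```

Implementing Display also provides to_string() through the standard library's blanket ToString implementation, so either requirement in the issue is covered by the one impl.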

handle multi-type array items that don't map directly to rust tuples

A JSON Schema array can have an array of types in the items field. If the maxItems and minItems both match the length of the items array, this type is well-modeled by a Rust tuple. If those cardinalities don't match, we need some more complex type to deal with optional, typed values, and a (potentially unbounded) collection of typed or untyped values past the end of the items array definitions.

Want a mechanism to avoid automatically applied derive macros

I have an OpenAPI document with a type PortId. That is an enum, but represented in the schema as a strictly-formatted string, as "prefixN", where the prefix is defined by the enum variant and N is a u8 limited to some range. For example, "qsfp4" or "rear0". Typify derives the ordering traits PartialOrd and Ord for this type, which delegate to the implementations for String. I would like to implement these traits myself, because (1) I'd like numeric sorting based on N rather than lexicographic, and (2) I'd like to put all values "rearM" before any "qsfpN".

It would be nice to have escape hatches to ask typify not to generate certain derivable traits, ideally for specific types. Something like a mapping from type names to traits that should not be derived seems pretty flexible.
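For illustration, the hand-written ordering described above might look like the following if the generated type did not already derive PartialOrd and Ord. The PortId definition here is a stand-in newtype over String, and the sort_key helper is hypothetical.

```rust
use std::cmp::Ordering;

/// Stand-in for the generated type: a newtype over the schema's string.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PortId(String);

impl PortId {
    /// (prefix rank, N): "rear" ports rank before "qsfp" ports.
    fn sort_key(&self) -> (u8, u8) {
        let (rank, digits) = if let Some(rest) = self.0.strip_prefix("rear") {
            (0, rest)
        } else if let Some(rest) = self.0.strip_prefix("qsfp") {
            (1, rest)
        } else {
            (2, "") // unknown prefixes sort last
        };
        (rank, digits.parse().unwrap_or(u8::MAX))
    }
}

impl Ord for PortId {
    fn cmp(&self, other: &Self) -> Ordering {
        // Fall back to the raw string so that `cmp` agrees with `Eq`.
        self.sort_key()
            .cmp(&other.sort_key())
            .then_with(|| self.0.cmp(&other.0))
    }
}

impl PartialOrd for PortId {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

With the derived traits in place, these impls would conflict with the generated ones, which is why an opt-out is needed.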

Support oneOf together with properties

When using properties and oneOf together, generation fails. I was using this pattern for a couple use cases:

  • require either, but not both (I've since removed this from the schema) - e.g.
    {
     "oneOf": [
       { "required": ["property1"] },
       { "required": ["property2"] }
     ]
    }
  • "inheritance" - add a set of additional optional properties on top of the oneOf choice. See ReferenceDef below.

These can both be worked around fairly trivially (and I think I probably will do that in this case) but these both seem like reasonable cases to handle.

Schema:

{
    "$schema": "https://json-schema.org/draft-07/schema#",
    "$id": "https://badmagic100.github.io/HollowKnight.Packages/hpackage.schema.json",
    "title": "Hollow Knight PackageDef",
    "description": "Package definition schema for Hollow Knight mods and modpacks",
    "type": "object",
    "required": ["name", "description", "repository", "assets"],
    "additionalProperties": false,
    "properties": {
        "name": {
            "type": "object"
        },
        "description": {
            "type": "string"
        },
        "author": {
            "type": "string"
        },
        "repository": {
            "$ref": "#/definitions/Link"
        },
        "assets": {
            "type": "array",
            "items": {
                "$ref": "#/definitions/Asset"
            }
        },
        "dependencies": {
            "$ref": "#/definitions/References"
        },
        "devDependencies": {
            "$ref": "#/definitions/References"
        }
    },
    "definitions": {
        "Link": {
            "type": "string",
            "format": "uri"
        },
        "Asset": {
            "oneOf": [
                {
                    "type": "string"
                },
                {
                    "$ref": "#/definitions/PlatformAsset"
                }
            ]
        },
        "PlatformAsset": {
            "type": "object",
            "required": ["platform", "path"],
            "additionalProperties": false,
            "properties": {
                "platform": {
                    "$ref": "#/definitions/Platform"
                },
                "path": {
                    "type": "string"
                }
            }
        },
        "Platform": {
            "type": "string",
            "enum": ["win32", "macos", "linux"]
        },
        "References": {
            "oneOf": [
                { "type": "array", "items": { "type": "string" }, "$comment": "An array of mod names. Version is inferred to be @modlinks." },
                { "type": "object", "additionalProperties": { "$ref": "#/definitions/StringVersion" }, "$comment": "A map of mod name to version." },
                { "type": "object", "additionalProperties": { "$ref": "#/definitions/ReferenceDef" }, "$comment": "A map of mod name to full version spec." }
            ]
        },
        "ReferenceDef": {
            "type": "object",
            "oneOf": [
                { "$ref": "#/definitions/GitReference" },
                { "$ref": "#/definitions/ModlinksReference" },
                { "$ref": "#/definitions/LinkReference" }
            ],
            "properties": {
                "alternateInstallName": {
                    "type": "string"
                },
                "fileType": {
                    "type": "string",
                    "enum": ["zip", "dll"]
                }
            }
        },
        "StringVersion": {
            "$comment": "A mod version identifier. Versions are processed, in precedence order, as follows:\n* Direct download URL\n* @latest or @modlinks, specifying the latest version in the git repo or the latest version on modlinks respectively\n* A modlinks version number, if that version is present on modlinks.\n* A github release tag",
            "type": "string"
        },
        "GitReference": {
            "type": "object",
            "additionalProperties": false,
            "properties": {
                "tag": {
                    "type": "string"
                },
                "useLatestRelease": {
                    "type": "boolean",
                    "const": true,
                    "$comment": "If this flag is set, tag is ignored."
                },
                "asset": {
                    "type": "string"
                }
            }
        },
        "ModlinksReference": {
            "type": "object",
            "additionalProperties": false,
            "properties": {
                "version": {
                    "type": "string"
                },
                "useLatestPublished": {
                    "type": "boolean",
                    "const": true,
                    "$comment": "If this flag is set, version is ignored."
                }
            }
        },
        "LinkReference": {
            "type": "object",
            "required": ["link"],
            "additionalProperties": false,
            "properties": {
                "link": {
                    "$ref": "#/definitions/Link"
                }
            }
        }
    }
}

cargo typify output:

Message:  not yet implemented: invalid (or unexpected) schema:
SchemaObject {
    metadata: None,
    instance_type: Some(
        Single(
            Object,
        ),
    ),
    format: None,
    enum_values: None,
    const_value: None,
    subschemas: Some(
        SubschemaValidation {
            all_of: None,
            any_of: None,
            one_of: Some(
                [
                    Object(
                        SchemaObject {
                            metadata: None,
                            instance_type: None,
                            format: None,
                            enum_values: None,
                            const_value: None,
                            subschemas: None,
                            number: None,
                            string: None,
                            array: None,
                            object: None,
                            reference: Some(
                                "#/definitions/GitReference",
                            ),
                            extensions: {},
                        },
                    ),
                    Object(
                        SchemaObject {
                            metadata: None,
                            instance_type: None,
                            format: None,
                            enum_values: None,
                            const_value: None,
                            subschemas: None,
                            number: None,
                            string: None,
                            array: None,
                            object: None,
                            reference: Some(
                                "#/definitions/ModlinksReference",
                            ),
                            extensions: {},
                        },
                    ),
                    Object(
                        SchemaObject {
                            metadata: None,
                            instance_type: None,
                            format: None,
                            enum_values: None,
                            const_value: None,
                            subschemas: None,
                            number: None,
                            string: None,
                            array: None,
                            object: None,
                            reference: Some(
                                "#/definitions/LinkReference",
                            ),
                            extensions: {},
                        },
                    ),
                ],
            ),
            not: None,
            if_schema: None,
            then_schema: None,
            else_schema: None,
        },
    ),
    number: None,
    string: None,
    array: None,
    object: Some(
        ObjectValidation {
            max_properties: None,
            min_properties: None,
            required: {},
            properties: {
                "alternateInstallName": Object(
                    SchemaObject {
                        metadata: None,
                        instance_type: Some(
                            Single(
                                String,
                            ),
                        ),
                        format: None,
                        enum_values: None,
                        const_value: None,
                        subschemas: None,
                        number: None,
                        string: None,
                        array: None,
                        object: None,
                        reference: None,
                        extensions: {},
                    },
                ),
                "fileType": Object(
                    SchemaObject {
                        metadata: None,
                        instance_type: Some(
                            Single(
                                String,
                            ),
                        ),
                        format: None,
                        enum_values: Some(
                            [
                                String("zip"),
                                String("dll"),
                            ],
                        ),
                        const_value: None,
                        subschemas: None,
                        number: None,
                        string: None,
                        array: None,
                        object: None,
                        reference: None,
                        extensions: {},
                    },
                ),
            },
            pattern_properties: {},
            additional_properties: None,
            property_names: None,
        },
    ),
    reference: None,
    extensions: {},
}

Support seconds/duration format

I have found a couple of OpenAPI schemas in the wild (both from Golang shops) that use a type of integer with a format of seconds.
The expected value is a human-readable duration, e.g. 24h.

So this would imply that they're expecting a duration type which can be set by human readable values, that evaluates to, well, seconds.

The API itself expects either the number of seconds as an integer, or a human-readable duration string, e.g. 24h.
I'm fine with treating seconds as a uint64. Thoughts?

Presently, typify just fails with a type error when it encounters this scenario. So at the very least, defaulting to whatever is the "default" for integer would be preferred.
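To make the two accepted forms concrete, here is a small sketch of a parser that accepts either a bare integer or a single-unit duration and returns seconds as a u64. The function name and the supported unit suffixes (s, m, h) are assumptions for this example; the APIs in question may accept other forms.

```rust
/// Parse "86400" or "24h" into a number of seconds.
/// Hypothetical helper: only a single trailing unit of s, m, or h is handled.
fn parse_seconds(input: &str) -> Option<u64> {
    let input = input.trim();
    // Plain integer: already a number of seconds.
    if let Ok(secs) = input.parse::<u64>() {
        return Some(secs);
    }
    // Otherwise split off the final byte as the unit suffix.
    let split = input.len().checked_sub(1)?;
    if !input.is_char_boundary(split) {
        return None; // last character is not a single-byte unit
    }
    let (digits, unit) = input.split_at(split);
    let multiplier: u64 = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3600,
        _ => return None,
    };
    // `checked_mul` avoids silent overflow on very large inputs.
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}
```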

`clone-on-copy` Clippy warning in generated code

The following impl results in a Clippy (clone-on-copy) warning for simple enums that derive Copy:

impl From<&#type_name> for #type_name {
    fn from(value: &#type_name) -> Self {
        value.clone()
    }
}

error: using `clone` on type `Decomposable` which implements the `Copy` trait
   --> /home/runner/work/substrait-rs/substrait-rs/target/debug/build/substrait-7d34e5b12ba1084d/out/substrait_text.rs:143:9
    |
143 |         value.clone()
    |         ^^^^^^^^^^^^^ help: try dereferencing it: `*value`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_copy
    = note: `-D clippy::clone-on-copy` implied by `-D warnings`
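For a type that derives Copy, the form Clippy suggests is a plain dereference. The sketch below shows it on a stand-in enum; the variants are invented for the example and are not the real Decomposable definition.

```rust
/// Stand-in for a simple generated enum that derives `Copy`.
/// The variant names are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Decomposable {
    Never,
    Once,
    Many,
}

impl From<&Decomposable> for Decomposable {
    fn from(value: &Decomposable) -> Self {
        // `Copy` types can be dereferenced instead of cloned,
        // which is what the lint asks for.
        *value
    }
}
```

The dereference only compiles for Copy types, so a generator using one shared template would need to know whether Copy is derived before emitting it; allowing the clippy::clone_on_copy lint in generated code is the other obvious option.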

Log

cannot convert type string with format i64

When we have a property with some field like so:

        "properties": {
          "iterations": {
            "format": "int64",
            "type": "string"
          },

Typify crashes with a panic

Caused by:
  process didn't exit successfully: `/Users/<redacted>/projects/<redacted>/crates/target/debug/build/<redacted>-631c4662e317d757/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'not yet implemented: Some(
      "int64",
  )', /Users/<redacted>/.cargo/git/checkouts/typify-288d5a84bbbe6a46/b712fe6/typify-impl/src/convert.rs:418:26
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Add an example to example macro

To help users understand how to use the types after doing import_types!, add a simple example to the example-macro sources.

Potentially misplaced `deny_unknown_fields` when schema has optional fields

I have a simple JSON schema:

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "title": "Example schema",
    "type": "object",
    "required": [
        "readings"
    ],
    "properties": {
        "readings": {
            "type": "object",
            "title": "Readings to take",
            "additionalProperties": {
                "type": "object",
                "title": "Reading schema"
            },
            "minProperties": 1
        }
    }
}

I.e. an object with a readings key, which is itself an object with one or more objects nested under it.

I ran cargo typify --no-builder on it (v0.0.11) and it yielded the following Rust definitions:

#![allow(clippy::redundant_closure_call)]
#![allow(clippy::needless_lifetimes)]
#![allow(clippy::match_single_binding)]
#![allow(clippy::clone_on_copy)]

use serde::{Deserialize, Serialize};

#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct ExampleSchema {
    pub readings: ReadingsToTake,
}
impl From<&ExampleSchema> for ExampleSchema {
    fn from(value: &ExampleSchema) -> Self {
        value.clone()
    }
}
#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
pub struct ReadingsToTake {
    #[serde(flatten)]
    pub extra:
        std::collections::HashMap<String, std::collections::HashMap<String, serde_json::Value>>,
}
impl From<&ReadingsToTake> for ReadingsToTake {
    fn from(value: &ReadingsToTake) -> Self {
        value.clone()
    }
}

The #[serde(deny_unknown_fields)] line appears potentially misplaced, and prevents the parsing of valid JSON.

Let's say I have the JSON string

{
    "readings": {
        "something": {"a": 1}
    }
}

Trying to load this into the Rust structure results in an Error("unknown field something") - because the field is indeed not explicitly defined in the structure, and not allowed by the deny_unknown_fields.

I played around a bit, and it seems that if I remove either the additionalProperties or minProperties directives from the schema, the deny_unknown_fields line goes away.

It's quite possible that I'm doing something wrong, but at first pass this feels like a bug.

defaults

I notice that the generated code for schemas with defaults provides serde attributes such as default::default_i64<...>()
Is it intended that the user of code-generation produce those default functions? It's easy enough to do. I was just wondering if you had a library in mind.
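If the functions do have to be supplied by hand, one possible shape is a generic helper that takes the default value as a const parameter. This is a sketch under assumptions: the module name, the exact signature, and the serde attribute form shown in the comment are guesses based on the attribute quoted above, not confirmed generator output.

```rust
/// Hypothetical hand-written module for serde default functions.
pub mod defaults {
    use std::convert::TryFrom;
    use std::fmt::Debug;

    /// Return the constant `V` converted into the field's type `T`.
    /// Intended for use as, for example,
    /// `#[serde(default = "defaults::default_i64::<u16, 8080>")]`
    /// (assumed attribute form).
    pub fn default_i64<T, const V: i64>() -> T
    where
        T: TryFrom<i64>,
        <T as TryFrom<i64>>::Error: Debug,
    {
        // Panics if `V` does not fit in `T`, which would indicate a
        // default that contradicts the field's type.
        T::try_from(V).unwrap()
    }
}
```

A single generic function like this covers every integer field type, so no per-field default function has to be written.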

Failing to parse spec with "multiple implied types for untyped enum"

Hi all, I'm trying to convert the latest Vega schema (v5.22.1 found here). I was working off the example-build, and had a build.rs with the following:

use std::{fs, path::Path};

use schemars::schema::Schema;
use typify::{TypeSpace, TypeSpaceSettings};

fn main() {
    let content = include_str!("v5.22.1.json");
    let schema = serde_json::from_str::<schemars::schema::RootSchema>(content).unwrap();

    let mut type_space = TypeSpace::new(TypeSpaceSettings::default().with_struct_builder(true));
    type_space.add_ref_types(schema.definitions).unwrap();
    let base_type = &schema.schema;
    // Only convert the top-level type if it has a name
    if (|| base_type.metadata.as_ref()?.title.as_ref())().is_some() {
        let _ = type_space.add_type(&Schema::Object(schema.schema)).unwrap();
    }

    let content = format!(
        "{}\n{}",
        "use serde::{Deserialize, Serialize};",
        type_space.to_string()
    );

    let out_file = Path::new("src/gen.rs").to_path_buf();
    fs::write(out_file, &content).unwrap();
}

This results in the following error:

thread 'main' panicked at 'multiple implied types for an un-typed enum {Number, Null, String} [Null, String("normal"), String("bold"), String("lighter"), String("bolder"), String("100"), String("200"), String("300"), String("400"), String("500"), String("600"), String("700"), String("800"), String("900"), Number(100), Number(200), Number(300), Number(400), Number(500), Number(600), Number(700), Number(800), Number(900)]'

It seems like it has trouble with portions of the schema that use something like fontWeight.

Any help (or mentoring to fix the issue) would be appreciated.

Found an error with a reference and instance_type

--- stdout
Writing to: .\src\behavior\animations\animations.rs

--- stderr
thread 'main' panicked at 'not yet implemented: invalid (or unexpected) schema:
SchemaObject {
    metadata: None,
    instance_type: Some(
        Single(
            String,
        ),
    ),
    format: None,
    enum_values: None,
    const_value: None,
    subschemas: None,
    number: None,
    string: None,
    array: None,
    object: None,
    reference: Some(
        "#/definitions/commands",
    ),
    extensions: {},
}', C:\Users\miner.cargo\registry\src\github.com-1ecc6299db9ec823\typify-impl-0.0.10\src\convert.rs:395:36

The file that errored out

{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "blockception.minecraft.behavior.animation_controller",
"examples": [
  {
    "format_version": "1.19.0",
    "animation_controllers": {
      "controller.animation.example": {
        "initial_state": "default",
        "states": {
          "default": {
            "transitions": [
              {
                "state_1": "query.is_baby"
              }
            ]
          },
          "state_1": {}
        }
      }
    }
  }
],
"definitions": {
  "animationspec": {
    "anyOf": [
      {
        "title": "Animation Specification",
        "description": "A single string that specifies which animation there are.",
        "type": "string"
      },
      {
        "type": "object",
        "title": "Animation Specification",
        "description": "A object specification on when to animate.",
        "maxProperties": 1,
        "minProperties": 1,
        "additionalProperties": {
          "$ref": "#/definitions/A"
        }
      }
    ]
  },
  "particle_effect_spec": {
    "additionalProperties": false,
    "type": "object",
    "required": [
      "effect"
    ],
    "properties": {
      "bind_to_actor": {
        "type": "boolean",
        "title": "Bind To Actor",
        "description": "Set to false to have the effect spawned in the world without being bound to an actor (by default an effect is bound to the actor).",
        "const": false
      },
      "effect": {
        "type": "string",
        "title": "Effect",
        "description": "The name of a particle effect that should be played."
      },
      "locator": {
        "type": "string",
        "title": "Locator",
        "description": "The name of a locator on the actor where the effect should be located."
      },
      "pre_effect_script": {
        "type": "string",
        "title": "Pre Effect Script",
        "description": "A molang script that will be run when the particle emitter is initialized."
      }
    }
  },
  "commands": {
    "type": "string",
    "description": "The event or commands to execute.",
    "title": "Commands",
    "oneOf": [
      {
        "pattern": "^@s .+$",
        "title": "Event"
      },
      {
        "pattern": "^/.+$",
        "title": "Command"
      },
      {
        "pattern": "^.+;$",
        "title": "Molang"
      }
    ]
  },
  "A": {
    "type": "string",
    "title": "Molang",
    "description": "Molang definition.",
    "format": "molang",
    "examples": [
      "query.variant",
      "(1.0)",
      "query.",
      "variable.=;"
    ],
    "defaultSnippets": [
      {
        "label": "New Molang",
        "body": "$1"
      }
    ]
  },
  "B": {
    "title": "Format Version",
    "description": "A version that tells minecraft what type of data format can be expected when reading this file.",
    "pattern": "^([1-9]+)\\.([0-9]+)\\.([0-9]+)$",
    "type": "string",
    "default": "1.19.40",
    "examples": [
      "1.19.40",
      "1.18.0",
      "1.17.0",
      "1.16.0",
      "1.15.0",
      "1.14.0",
      "1.13.0",
      "1.12.0",
      "1.10.0",
      "1.8.0"
    ],
    "defaultSnippets": [
      {
        "label": "New Format version",
        "body": "1.${1|8,10,12,17,18,19|}.${3|2|0|}"
      }
    ]
  }
},
"type": "object",
"title": "Animation Controller",
"description": "Animation controller for behaviors.",
"required": [
  "format_version",
  "animation_controllers"
],
"additionalProperties": false,
"properties": {
  "format_version": {
    "$ref": "#/definitions/B"
  },
  "animation_controllers": {
    "type": "object",
    "title": "Animation Controllers",
    "description": "The animation controllers schema for.",
    "propertyNames": {
      "pattern": "^controller\\.animation\\.[a-z\\.]+",
      "examples": [
        "controller.animation.example",
        "controller.animation.example.foo"
      ]
    },
    "additionalProperties": {
      "additionalProperties": false,
      "type": "object",
      "title": "Animation Controller",
      "description": "A single animation controller.",
      "required": [
        "states"
      ],
      "minProperties": 1,
      "properties": {
        "states": {
          "title": "States",
          "description": "The states of this animation controller.",
          "propertyNames": {
            "pattern": "[a-z\\.]+"
          },
          "minProperties": 1,
          "type": "object",
          "additionalProperties": {
            "additionalProperties": false,
            "title": "Animation State",
            "description": "Animation state.",
            "type": "object",
            "examples": [
              {
                "animations": [
                  "anim.idle"
                ],
                "transitions": [
                  {
                    "example": "query.is_sheared"
                  }
                ]
              }
            ],
            "properties": {
              "animations": {
                "title": "Animations",
                "description": "The animations definition for.",
                "type": "array",
                "items": {
                  "$ref": "#/definitions/animationspec",
                  "description": "The key definition of an animation to play, defined in the entity.",
                  "title": "Animations"
                }
              },
              "on_entry": {
                "type": "array",
                "description": "Events, commands or transitions to preform on entry of this state.",
                "title": "On Entry",
                "items": {
                  "$ref": "#/definitions/commands"
                }
              },
              "on_exit": {
                "type": "array",
                "description": "Events, commands or transitions to preform on exit of this state.",
                "title": "On Exit",
                "items": {
                  "$ref": "#/definitions/commands"
                }
              },
              "transitions": {
                "title": "Transition",
                "description": "The transition definition for.",
                "minProperties": 1,
                "type": "array",
                "items": {
                  "title": "Transition",
                  "description": "A transition to another state.",
                  "type": "object",
                  "maxProperties": 1,
                  "minProperties": 1,
                  "examples": [
                    {
                      "default": "query.is_chested"
                    }
                  ],
                  "additionalProperties": {
                    "$ref": "#/definitions/A"
                  }
                }
              }
            }
          }
        },
        "initial_state": {
          "title": "Initial State",
          "description": "The state to start with, if not specified state at position 0 in the array is used.",
          "type": "string",
          "examples": [
            "default"
          ]
        }
      }
    }
  }
}
}

error on schemas with type omitted

Running the current code on this schema panics:

https://github.com/Marwes/schemafy/blob/master/tests/one-of-types.json

Does typify support oneOf?

Presence of maxLength in enum property causes schema build failures

It seems that the presence of the maxLength field in a string enum property causes the addition of extra super:: keywords in the codegen file. Below is the failing schema, the compile error, and a schema with the maxLength field removed that successfully builds.

failing schema:

{
  "$id": "https://example.com/arrays.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "example title",
  "type": "object",
  "properties": {
    "enum_property": {
      "description": "name",
      "type": "string",
      "maxLength" : 32,
      "enum": ["first", "second"]
    }
  }
}

compile error:

error[E0433]: failed to resolve: there are too many leading `super` keywords
   |
26 |             super::ExampleTitleEnumPropertyInner("first".to_string()),
   |             ^^^^^ there are too many leading `super` keywords

error[E0433]: failed to resolve: there are too many leading `super` keywords
   |
27 |             super::ExampleTitleEnumPropertyInner("second".to_string()),
   |             ^^^^^ there are too many leading `super` keywords

Changes required to fix the schema:

{
  "$id": "https://example.com/arrays.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "example title",
  "type": "object",
  "properties": {
    "enum_property": {
      "description": "name",
      "type": "string",
-     "maxLength" : 32,
      "enum": ["first", "second"]
    }
  }
}

Option to disable internal tagging

https://serde.rs/enum-representations.html#internally-tagged

Internal tagging has a side-effect that I need to avoid, and it would be great if it could be disabled in the TypeSpaceSettings.

I have a oneOf that looks like

    ReportData:
      oneOf:
        - $ref: "./ReportType1.yaml#/components/schemas/ReportType1"
        - $ref: "./ReportType2.yaml#/components/schemas/ReportType2"
        - $ref: "./ReportType3.yaml#/components/schemas/ReportType3"
        - $ref: "./ReportType4.yaml#/components/schemas/ReportType4"

each of them looks like

    ReportType1:
      type: object
      required:
        - report_type
        - other
        - stuff
      properties:
        report_type:
          type: string
          default: type1
          enum:
            - type1
        ...

This generates

    #[serde(tag = "report_type")]
    pub enum ReportData {
        #[serde(rename = "type1")]
        ReportType1 { .. },
        #[serde(rename = "type2")]
        ReportType2 { ...  },
        #[serde(rename = "type3")]
        ReportType3 {  ... },
        #[serde(rename = "type4")]
        ReportType4 { ... },
    }

The problem is that all of the members of ReportType1 are inlined into the enum variant, except for the report_type member. Likewise for ReportType2, etc. That is, the enum does not refer to the struct for ReportType1; instead it creates an enum variant that is similar but omits the report_type field.

I appreciate this is appropriate for many usages, but we would prefer to have the enum variants use the described types as-is, without internal tagging, knowing the negatives of external tagging.

Is there something already that allows control of this (with_struct_builder looks interesting)? If not, what would be the preferred approach, so that we can build it in an acceptable, mergeable PR?

JSON Schema fails to codegen with "called `Option::unwrap()` on a `None` value" panic

I am not sure where to get started with troubleshooting the issue. When running cargo typify or calling type_space.to_stream() in the code I get:

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /Users/mlieberman/.cargo/registry/src/index.crates.io-6f17d22bba15001f/typify-impl-0.0.12/src/defaults.rs:322:18

This is on the following JSON Schema which does parse for me as a valid schema in other tools and libraries:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "InTotoStatementV1",
  "description": "Represents an In-Toto v1 statement.",
  "type": "object",
  "required": [
    "_type",
    "predicate",
    "predicateType",
    "subject"
  ],
  "properties": {
    "_type": {
      "type": "string",
      "format": "uri"
    },
    "predicate": {
      "$ref": "#/definitions/Predicate"
    },
    "predicateType": {
      "type": "string",
      "format": "uri"
    },
    "subject": {
      "type": "array",
      "items": {
        "$ref": "#/definitions/Subject"
      }
    }
  },
  "definitions": {
    "BuildDefinition": {
      "description": "A structure representing the build definition of the SLSA Provenance v1 Predicate.",
      "type": "object",
      "required": [
        "buildType",
        "externalParameters",
        "internalParameters",
        "resolvedDependencies"
      ],
      "properties": {
        "buildType": {
          "type": "string",
          "format": "uri"
        },
        "externalParameters": true,
        "internalParameters": true,
        "resolvedDependencies": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/ResourceDescriptor"
          }
        }
      }
    },
    "Builder": {
      "description": "A structure representing the builder information of the SLSA Provenance v1 Predicate.",
      "type": "object",
      "required": [
        "id"
      ],
      "properties": {
        "builderDependencies": {
          "default": null,
          "type": [
            "array",
            "null"
          ],
          "items": {
            "$ref": "#/definitions/ResourceDescriptor"
          }
        },
        "id": {
          "type": "string",
          "format": "uri"
        },
        "version": {
          "type": [
            "string",
            "null"
          ]
        }
      }
    },
    "DigestSet": {
      "description": "Represents a set of digests, mapping algorithms to their respective digest strings.",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "Metadata": {
      "description": "A structure representing the metadata of the SLSA Provenance v1 Predicate.",
      "type": "object",
      "required": [
        "invocationId",
        "startedOn"
      ],
      "properties": {
        "finishedOn": {
          "type": [
            "string",
            "null"
          ],
          "format": "date-time"
        },
        "invocationId": {
          "type": "string"
        },
        "startedOn": {
          "type": "string",
          "format": "date-time"
        }
      }
    },
    "Predicate": {
      "description": "An enum representing different predicate types.\n\nKnown predicate types have their own variants, while unknown types are represented by the `Other` variant, which stores the raw JSON value.\n\nTODO(mlieberman85): Support (de)serializing the predicates based on the predicateType URL in the statement.",
      "anyOf": [
        {
          "$ref": "#/definitions/SLSAProvenanceV1Predicate"
        },
        true
      ]
    },
    "ResourceDescriptor": {
      "description": "A structure representing a resource descriptor in the SLSA Provenance v1 Predicate.",
      "type": "object",
      "required": [
        "uri"
      ],
      "properties": {
        "annotations": true,
        "content": {
          "default": null,
          "type": "string"
        },
        "digest": {
          "type": [
            "object",
            "null"
          ],
          "additionalProperties": {
            "type": "string"
          }
        },
        "downloadLocation": {
          "default": null,
          "type": "string",
          "format": "uri"
        },
        "mediaType": {
          "type": [
            "string",
            "null"
          ]
        },
        "name": {
          "type": [
            "string",
            "null"
          ]
        },
        "uri": {
          "type": "string",
          "format": "uri"
        }
      }
    },
    "RunDetails": {
      "description": "A structure representing the run details of the SLSA Provenance v1 Predicate.",
      "type": "object",
      "required": [
        "builder",
        "metadata"
      ],
      "properties": {
        "builder": {
          "$ref": "#/definitions/Builder"
        },
        "byproducts": {
          "default": null,
          "type": [
            "array",
            "null"
          ],
          "items": {
            "$ref": "#/definitions/ResourceDescriptor"
          }
        },
        "metadata": {
          "$ref": "#/definitions/Metadata"
        }
      }
    },
    "SLSAProvenanceV1Predicate": {
      "description": "A structure representing the SLSA Provenance v1 Predicate.",
      "type": "object",
      "required": [
        "buildDefinition",
        "runDetails"
      ],
      "properties": {
        "buildDefinition": {
          "$ref": "#/definitions/BuildDefinition"
        },
        "runDetails": {
          "$ref": "#/definitions/RunDetails"
        }
      }
    },
    "Subject": {
      "description": "Represents a subject in an In-Toto v1 statement.",
      "type": "object",
      "required": [
        "digest",
        "name"
      ],
      "properties": {
        "digest": {
          "$ref": "#/definitions/DigestSet"
        },
        "name": {
          "type": "string"
        }
      }
    }
  }
}

I have also printed out the debug of the TypeSpace itself which is quite long so here's the linked gist: https://gist.github.com/mlieberman85/5073baf828b74a55183b6c386a476036

Is there anything I should take a look at here to help troubleshoot? Thanks.

command line tool to output rust code [HELP WANTED]

It would be great to have a command line tool in the crate that we could run to output a rust file from JSON schema input. Ideally cargo run within the repo would invoke this by default (i.e. w/o --bin whatever).

Perhaps it would be nice to call it cargo-typify and let users cargo install cargo-typify to run from outside of the repo.

Issue with schema props

.1.0 (C:\Users\miner\OneDrive\Documents\GitHub\Androecia\FriendConnect-rs\minecraft-bedrock-schemas-rs)`

Caused by:
process didn't exit successfully: C:\Users\miner\OneDrive\Documents\GitHub\Androecia\FriendConnect-rs\minecraft-bedrock-schemas-rs\target\debug\build\minecraft-bedrock-schemas-89cec1978fa2cbc3\build-script-build (exit code: 101)
--- stdout
Writing to: .\src\behavior\animations\animations.rs
Writing to: .\src\behavior\animation_controllers\animation_controller.rs

--- stderr
thread 'main' panicked at 'not yet implemented: invalid (or unexpected) schema:
SchemaObject {
metadata: Some(
Metadata {
id: None,
title: Some(
"Transition",
),
description: Some(
"The transition definition for.",
),
default: None,
deprecated: false,
read_only: false,
write_only: false,
examples: [],
},
),
instance_type: Some(
Single(
Array,
),
),
format: None,
enum_values: None,
const_value: None,
subschemas: None,
number: None,
string: None,
array: Some(
ArrayValidation {
items: Some(
Single(
Object(
SchemaObject {
metadata: Some(
Metadata {
id: None,
title: Some(
"Transition",
),
description: Some(
"A transition to another state.",
),
default: None,
deprecated: false,
read_only: false,
write_only: false,
examples: [
Object {
"default": String("query.is_chested"),
},
],
},
),
instance_type: Some(
Single(
Object,
),
),
format: None,
enum_values: None,
const_value: None,
subschemas: None,
number: None,
string: None,
array: None,
object: Some(
ObjectValidation {
max_properties: Some(
1,
),
min_properties: Some(
1,
),
required: {},
properties: {},
pattern_properties: {},
additional_properties: Some(
Object(
SchemaObject {
metadata: None,
instance_type: None,
format: None,
enum_values: None,
const_value: None,
subschemas: None,
number: None,
string: None,
array: None,
object: None,
reference: Some(
"#/definitions/A",
),
extensions: {},
},
),
),
property_names: None,
},
),
reference: None,
extensions: {},
},
),
),
),
additional_items: None,
max_items: None,
min_items: None,
unique_items: None,
contains: None,
},
),
object: Some(
ObjectValidation {
max_properties: None,
min_properties: Some(
1,
),
required: {},
properties: {},
pattern_properties: {},
additional_properties: None,
property_names: None,
},
),
reference: None,
extensions: {},
}', typify\typify-impl\src\convert.rs:551:36

{
  "$schema": "http://json-schema.org/draft-07/schema",
  "$id": "blockception.minecraft.behavior.animation_controller",
  "examples": [
    {
      "format_version": "1.19.0",
      "animation_controllers": {
        "controller.animation.example": {
          "initial_state": "default",
          "states": {
            "default": {
              "transitions": [
                {
                  "state_1": "query.is_baby"
                }
              ]
            },
            "state_1": {}
          }
        }
      }
    }
  ],
  "definitions": {
    "animationspec": {
      "anyOf": [
        {
          "title": "Animation Specification",
          "description": "A single string that specifies which animation there are.",
          "type": "string"
        },
        {
          "type": "object",
          "title": "Animation Specification",
          "description": "A object specification on when to animate.",
          "maxProperties": 1,
          "minProperties": 1,
          "additionalProperties": {
            "$ref": "#/definitions/A"
          }
        }
      ]
    },
    "particle_effect_spec": {
      "additionalProperties": false,
      "type": "object",
      "required": [
        "effect"
      ],
      "properties": {
        "bind_to_actor": {
          "type": "boolean",
          "title": "Bind To Actor",
          "description": "Set to false to have the effect spawned in the world without being bound to an actor (by default an effect is bound to the actor).",
          "const": false
        },
        "effect": {
          "type": "string",
          "title": "Effect",
          "description": "The name of a particle effect that should be played."
        },
        "locator": {
          "type": "string",
          "title": "Locator",
          "description": "The name of a locator on the actor where the effect should be located."
        },
        "pre_effect_script": {
          "type": "string",
          "title": "Pre Effect Script",
          "description": "A molang script that will be run when the particle emitter is initialized."
        }
      }
    },
    "commands": {
      "type": "string",
      "description": "The event or commands to execute.",
      "title": "Commands",
      "oneOf": [
        {
          "pattern": "^@s .+$",
          "title": "Event"
        },
        {
          "pattern": "^/.+$",
          "title": "Command"
        },
        {
          "pattern": "^.+;$",
          "title": "Molang"
        }
      ]
    },
    "A": {
      "type": "string",
      "title": "Molang",
      "description": "Molang definition.",
      "format": "molang",
      "examples": [
        "query.variant",
        "(1.0)",
        "query.",
        "variable.=;"
      ],
      "defaultSnippets": [
        {
          "label": "New Molang",
          "body": "$1"
        }
      ]
    },
    "B": {
      "title": "Format Version",
      "description": "A version that tells minecraft what type of data format can be expected when reading this file.",
      "pattern": "^([1-9]+)\\.([0-9]+)\\.([0-9]+)$",
      "type": "string",
      "default": "1.19.40",
      "examples": [
        "1.19.40",
        "1.18.0",
        "1.17.0",
        "1.16.0",
        "1.15.0",
        "1.14.0",
        "1.13.0",
        "1.12.0",
        "1.10.0",
        "1.8.0"
      ],
      "defaultSnippets": [
        {
          "label": "New Format version",
          "body": "1.${1|8,10,12,17,18,19|}.${3|2|0|}"
        }
      ]
    }
  },
  "type": "object",
  "title": "Animation Controller",
  "description": "Animation controller for behaviors.",
  "required": [
    "format_version",
    "animation_controllers"
  ],
  "additionalProperties": false,
  "properties": {
    "format_version": {
      "$ref": "#/definitions/B"
    },
    "animation_controllers": {
      "type": "object",
      "title": "Animation Controllers",
      "description": "The animation controllers schema for.",
      "propertyNames": {
        "pattern": "^controller\\.animation\\.[a-z\\.]+",
        "examples": [
          "controller.animation.example",
          "controller.animation.example.foo"
        ]
      },
      "additionalProperties": {
        "additionalProperties": false,
        "type": "object",
        "title": "Animation Controller",
        "description": "A single animation controller.",
        "required": [
          "states"
        ],
        "minProperties": 1,
        "properties": {
          "states": {
            "title": "States",
            "description": "The states of this animation controller.",
            "propertyNames": {
              "pattern": "[a-z\\.]+"
            },
            "minProperties": 1,
            "type": "object",
            "additionalProperties": {
              "additionalProperties": false,
              "title": "Animation State",
              "description": "Animation state.",
              "type": "object",
              "examples": [
                {
                  "animations": [
                    "anim.idle"
                  ],
                  "transitions": [
                    {
                      "example": "query.is_sheared"
                    }
                  ]
                }
              ],
              "properties": {
                "animations": {
                  "title": "Animations",
                  "description": "The animations definition for.",
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/animationspec",
                    "description": "The key definition of an animation to play, defined in the entity.",
                    "title": "Animations"
                  }
                },
                "on_entry": {
                  "type": "array",
                  "description": "Events, commands or transitions to preform on entry of this state.",
                  "title": "On Entry",
                  "items": {
                    "$ref": "#/definitions/commands"
                  }
                },
                "on_exit": {
                  "type": "array",
                  "description": "Events, commands or transitions to preform on exit of this state.",
                  "title": "On Exit",
                  "items": {
                    "$ref": "#/definitions/commands"
                  }
                },
                "transitions": {
                  "title": "Transition",
                  "description": "The transition definition for.",
                  "minProperties": 1,
                  "type": "array",
                  "items": {
                    "title": "Transition",
                    "description": "A transition to another state.",
                    "type": "object",
                    "maxProperties": 1,
                    "minProperties": 1,
                    "examples": [
                      {
                        "default": "query.is_chested"
                      }
                    ],
                    "additionalProperties": {
                      "$ref": "#/definitions/A"
                    }
                  }
                }
              }
            }
          },
          "initial_state": {
            "title": "Initial State",
            "description": "The state to start with, if not specified state at position 0 in the array is used.",
            "type": "string",
            "examples": [
              "default"
            ]
          }
        }
      }
    }
  }
}

default on enum can cause TypeError(InvalidValue)

I have a failing input OAS emitted by a https://github.com/tiangolo/fastapi service, which I've trimmed down to this sample:

openapi: 3.0.2
info:
    title: FastAPI
    version: 0.1.0
paths: {}
components:
    schemas:
        A:
            properties:
                version:
                    title: Version
                    enum:
                        - foo
                    type: string
        B:
            properties:
                version:
                    title: Version
                    enum:
                        - bar
                    type: string
                    default: bar

This fails in progenitor with

gen fail: TypeError(InvalidValue)
Error: generation experienced errors

This fails at https://github.com/oxidecomputer/typify/blob/d579a52/typify-impl/src/defaults.rs#L82

Remove the default: bar and the error goes away.

Part of the cause is the title: Version being common on two different enums.

consider a different handling of `allOf` constructs

Currently an allOf results in a struct with the component schemas generated and flattened. This is often fine; it has the nice attribute that subordinate types are present in the final type. It can also be imprecise or awkward to use the generated type as the relevant components may be scattered around.

Another way to handle these schemas would be to first "consolidate" all members of the allOf construct. This would effectively mean taking the intersection of all validation criteria (paying attention to the value of additionalProperties). In order to preserve the relationship with the original component types, we could impl From<AllOfType> for ComponentType { ... } and allow users to convert between them (or clone().into::<ComponentType>() to do so non-destructively).

This could be an interesting use case to consider: https://github.com/Marwes/debugserver-types/blob/master/src/schema.json

In particular, we'd want to see a type like:

struct ContinuedEvent {
    event: Continued, // enum Continued { Continued }
    body: ContinuedEventBody,
    type_: EventType, // enum EventType { Event }
    seq: i64
}

impl From<ContinuedEvent> for Event {
    fn from(value: ContinuedEvent) -> Event {
        Event {
            event: value.event.to_string(),
            body: value.body.to_value(),
            type_: value.type_,
            seq: value.seq,
        }
    }
}

This would likely focus on TypeSpace::maybe_all_of_subclass

special case `null` in `convert_unknown_enum`

A schema such as

{
  "enum": [
    null,
    "multiply",
    "screen",
    "overlay",
    "darken",
    "lighten",
    "color-dodge",
    "color-burn",
    "hard-light",
    "soft-light",
    "difference",
    "exclusion",
    "hue",
    "saturation",
    "color",
    "luminosity"
  ]
}

produces an error like this:

thread 'test_other' panicked at 'multiple implied types for an un-typed enum {String, Null} [Null, String("multiply"), String("screen"), String("overlay"), String("darken"), String("lighten"), String("color-dodge"), String("color-burn"), String("hard-light"), String("soft-light"), String("difference"), String("exclusion"), String("hue"), String("saturation"), String("color"), String("luminosity")]', typify-impl/src/convert.rs:1109:18

This isn't necessary: we can handle an enum that includes null specially, as an optional type.
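
A minimal sketch of that special case, under stated assumptions: the `Json` enum below is only a stand-in for `serde_json::Value`, and `split_null` is a hypothetical helper, not an existing typify function. It separates `null` from the remaining values so that a generator could emit a string enum wrapped in `Option`.

```rust
/// Minimal stand-in for a JSON value (assumption: a real implementation
/// would operate on `serde_json::Value` instead).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Str(String),
}

/// Hypothetical helper: split the values of an un-typed `enum` into a
/// "nullable" flag and the remaining string values. A generator could then
/// emit `Option<GeneratedEnum>` whenever the flag is true.
fn split_null(values: &[Json]) -> (bool, Vec<String>) {
    // True if any of the listed enum values is `null`.
    let nullable = values.iter().any(|v| matches!(v, Json::Null));

    // Keep only the string values, preserving their original order.
    let strings: Vec<String> = values
        .iter()
        .filter_map(|v| match v {
            Json::Str(s) => Some(s.clone()),
            Json::Null => None,
        })
        .collect();

    (nullable, strings)
}
```

With the schema above, the flag would be true and the string list would start with "multiply" and "screen", so the implied types no longer conflict.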

support GitHub webhook schema

GitHub have a facility whereby they will make HTTP POST requests to a URL you nominate for your application or repository, with a large and complex payload. Fortunately there appears to be a JSON Schema now for that payload:

https://github.com/octokit/webhooks/tree/master/payload-schemas

Though the source in the repository contains many small fragments of JSON schema, it appears they publish an amalgamated document at:

https://unpkg.com/@octokit/webhooks-schemas/schema.json

At the moment, attempting to use that file with typify results in a proc macro panic:

error: proc macro panicked
 --> server/src/hooktypes.rs:3:1
  |
3 | import_types!("../schema/webhooks.json");
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: message: assertion failed: validation.additional_properties.is_none()

Enum Name Collision

In the Vega example, an enum with __count__ and count outputs:

pub enum AggregateTransformOpsVariant0ItemVariant0 {
    #[serde(rename = "count")]
    Count,
    #[serde(rename = "__count__")]
    Count,
   // ...
}

I'm looking into whether __count__ is a spec bug and not meant to be exposed 😅, but from a code gen perspective figuring out what to do with these collisions is probably worthwhile.
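
One possible way to deal with such collisions, as a rough sketch only: derive a PascalCase name from each raw value and append a counter when a name has already been used. Both `to_variant_name` and `unique_variant_names` are hypothetical names for illustration and do not describe how typify actually names variants.

```rust
use std::collections::HashMap;

/// Hypothetical: turn a raw enum value such as "__count__" or "color-dodge"
/// into a PascalCase identifier by splitting on non-alphanumeric characters.
fn to_variant_name(raw: &str) -> String {
    raw.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|part| !part.is_empty())
        .map(|part| {
            // Upper-case the first character and keep the rest unchanged.
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_ascii_uppercase().to_string() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

/// Hypothetical: assign each raw value a variant name, appending a counter
/// ("Count", "Count2", ...) when the derived name was already used.
fn unique_variant_names(raw_values: &[&str]) -> Vec<String> {
    let mut seen: HashMap<String, usize> = HashMap::new();
    raw_values
        .iter()
        .map(|raw| {
            let base = to_variant_name(raw);
            let count = seen.entry(base.clone()).or_insert(0);
            *count += 1;
            if *count == 1 {
                base
            } else {
                format!("{}{}", base, *count)
            }
        })
        .collect()
}
```

The sketch ignores two edge cases a real generator would have to handle: a suffixed name such as "Count2" can itself collide with another raw value, and a value with no alphanumeric characters yields an empty name. The `#[serde(rename = "...")]` attribute would still carry the original string, so only the Rust identifier changes.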

integer type selection is wonky

For example, this schema:

{
  "type": "integer",
  "minimum": 0
}

... translates to an i8. This is very very wrong. A u8 would be reasonable, but a u64 would be much better.
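
To make the expected behavior concrete, here is a minimal sketch of one selection rule: use the narrowest type that covers both bounds when both are known, and fall back to a 64-bit type when a bound is missing. The function `pick_int_type` is hypothetical and is not typify's implementation; it also ignores `exclusiveMinimum`, `exclusiveMaximum`, `multipleOf`, and the NonZero types.

```rust
/// Hypothetical helper (not part of typify): pick a Rust integer type name
/// for a JSON Schema integer with optional inclusive `minimum`/`maximum`.
fn pick_int_type(min: Option<i128>, max: Option<i128>) -> &'static str {
    match (min, max) {
        // Both bounds known: choose the narrowest type that covers the range.
        (Some(lo), Some(hi)) if lo <= hi => {
            let candidates: [(&'static str, i128, i128); 8] = [
                ("u8", u8::MIN as i128, u8::MAX as i128),
                ("i8", i8::MIN as i128, i8::MAX as i128),
                ("u16", u16::MIN as i128, u16::MAX as i128),
                ("i16", i16::MIN as i128, i16::MAX as i128),
                ("u32", u32::MIN as i128, u32::MAX as i128),
                ("i32", i32::MIN as i128, i32::MAX as i128),
                ("u64", u64::MIN as i128, u64::MAX as i128),
                ("i64", i64::MIN as i128, i64::MAX as i128),
            ];
            candidates
                .iter()
                .find(|(_, tmin, tmax)| lo >= *tmin && hi <= *tmax)
                .map(|(name, _, _)| *name)
                // Ranges wider than 64 bits fall back to i64 in this sketch;
                // a real generator would need to report them instead.
                .unwrap_or("i64")
        }
        // Only a non-negative lower bound: unsigned and as wide as possible.
        (Some(lo), None) if lo >= 0 => "u64",
        // Everything else (no bounds, negative minimum only, maximum only,
        // or inverted bounds): default to the widest signed type.
        _ => "i64",
    }
}
```

Under this rule the schema above (`minimum: 0`, no maximum) maps to `u64`, while a schema bounded to 0..=255 maps to `u8`.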

Failing schema: unhandled array validation

The schema is here: https://github.com/jgm/djot.js/blob/baa4d3e62bccaebe0680adea1b185adbd211fd8a/djot-schema.json and is, AFAIK, automatically generated using typescript-json-schema.

The output of cargo typify is

cargo typify --output src/types.rs djot.js/djot-schema.json
Error: 
   0: Failed to convert JSON Schema to Rust code
   1: Could not add ref types from the 'definitions' field in the JSON Schema
   2: schema invalid: unhandled array validation ArrayValidation {
          items: Some(
              Vec(
                  [
                      Object(
                          SchemaObject {
                              metadata: None,
                              instance_type: None,
                              format: None,
                              enum_values: None,
                              const_value: None,
                              subschemas: None,
                              number: None,
                              string: None,
                              array: None,
                              object: None,
                              reference: Some(
                                  "#/definitions/Caption",
                              ),
                              extensions: {},
                          },
                      ),
                  ],
              ),
          ),
          additional_items: Some(
              Object(
                  SchemaObject {
                      metadata: None,
                      instance_type: None,
                      format: None,
                      enum_values: None,
                      const_value: None,
                      subschemas: None,
                      number: None,
                      string: None,
                      array: None,
                      object: None,
                      reference: Some(
                          "#/definitions/Row",
                      ),
                      extensions: {},
                  },
              ),
          ),
          max_items: None,
          min_items: Some(
              1,
          ),
          unique_items: None,
          contains: None,
      }

look through references to improve externally tagged `enum` generation

This is a possible solution to some of the gnarly Vega code generation. Several of the more complex types end up with many VariantN layers (like @ahl mentioned in #165).

For example we have:

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(untagged)]
pub enum AlignValueVariant0ItemSubtype1Subtype1Subtype0 {
    Variant0(SignalRef),
    Variant1 {
        value: AlignValueVariant0ItemSubtype1Subtype1Subtype0Variant1Value,
    },
    Variant2 {
        field: Field,
    },
    Variant3 {
        range: AlignValueVariant0ItemSubtype1Subtype1Subtype0Variant3Range,
    },
}

It might be okay to replace instances of that enum with something like Box<dyn AsVegaAlign>.

test fail

  • rustc 1.59.0-nightly (efec54529 2021-12-04)
  • fresh clone
  • cargo test (stable or nightly) :
running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running unittests (target/debug/deps/typify_impl-9179ec6858f40f9a)

running 44 tests
test convert::tests::test_int_i8 ... ok
test convert::tests::test_int_i16 ... ok
test convert::tests::test_int_i64 ... ok
test convert::tests::test_int_i32 ... ok
test convert::tests::test_int_nonzerou32 ... ok
test convert::tests::test_int_nonzerou16 ... ok
test convert::tests::test_int_nonzerou8 ... ok
test convert::tests::test_int_u16 ... ok
test convert::tests::test_int_u64 ... ok
test convert::tests::test_int_nonzerou64 ... ok
test convert::tests::test_int_u32 ... ok
test convert::tests::test_set ... ignored
test convert::tests::test_int_u8 ... ok
test convert::tests::test_redundant_types ... ok
test enums::tests::test_adjacently_tagged_enum ... ok
test enums::tests::test_enum_detection_untagged ... ok
test enums::tests::test_externally_tagged_enum ... ok
test enums::tests::test_adjacently_tagged_enum_output ... ok
test enums::tests::test_externally_tagged_enum_output ... ok
test enums::tests::test_internally_tagged_enum ... ok
test enums::tests::test_internal_deny_simple ... ok
test enums::tests::test_result ... ok
test enums::tests::test_maybe_option_as_enum ... ok
test enums::tests::test_head_fake_tagged_enum ... ok
test enums::tests::test_internally_tagged_enum_output ... ok
test enums::tests::test_untagged_enum ... ok
test enums::tests::test_simple_untagged_enum ... ok
test structs::tests::test_less_simple_struct ... ok
test structs::tests::test_object_no_validation ... ok
test enums::tests::test_untagged_enum_output ... ok
test structs::tests::test_simple_struct ... ok
test tests::test_convert_enum_string ... ok
test tests::test_alias ... ok
test structs::tests::test_some_maps ... ok
test tests::test_string_enum_with_null ... ok
test type_entry::tests::test_ident ... ok
test util::tests::test_exclusive_one_empty_struct ... ok
test util::tests::test_exclusive_simple_arrays ... ok
test util::tests::test_exclusive_structs ... ok
test util::tests::test_non_exclusive_structs ... ok
test util::tests::test_sanitize ... ok
test tests::test_simple ... ok
test util::tests::test_unique_prop_structs ... ok
test structs::tests::test_flatten_stuff ... FAILED

failures:

---- structs::tests::test_flatten_stuff stdout ----
RootSchema {
    meta_schema: Some(
        "http://json-schema.org/draft-07/schema#",
    ),
    schema: SchemaObject {
        metadata: Some(
            Metadata {
                id: None,
                title: Some(
                    "FlattenStuff",
                ),
                description: None,
                default: None,
                deprecated: false,
                read_only: false,
                write_only: false,
                examples: [],
            },
        ),
        instance_type: Some(
            Single(
                Object,
            ),
        ),
        format: None,
        enum_values: None,
        const_value: None,
        subschemas: None,
        number: None,
        string: None,
        array: None,
        object: Some(
            ObjectValidation {
                max_properties: None,
                min_properties: None,
                required: {
                    "number",
                },
                properties: {
                    "number": Object(
                        SchemaObject {
                            metadata: None,
                            instance_type: Some(
                                Single(
                                    Integer,
                                ),
                            ),
                            format: Some(
                                "int32",
                            ),
                            enum_values: None,
                            const_value: None,
                            subschemas: None,
                            number: None,
                            string: None,
                            array: None,
                            object: None,
                            reference: None,
                            extensions: {},
                        },
                    ),
                },
                pattern_properties: {},
                additional_properties: None,
                property_names: None,
            },
        ),
        reference: None,
        extensions: {},
    },
    definitions: {},
}
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct FlattenStuff {
    pub number: i32,
}

thread 'structs::tests::test_flatten_stuff' panicked at 'lengths don't match: 2 != 1', typify-impl/src/structs.rs:460:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    structs::tests::test_flatten_stuff

test result: FAILED. 42 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.09s

error: test failed, to rerun pass '-p typify-impl --lib'

ignore irrelevant validation for typed schemas

When a schema contains a type field (and we'll only consider a single item in that field for the purpose of this issue), the JSON Schema spec says that validation can ignore extraneous fields. That means that an object can ignore, say, an items field relevant only for array types, or a minimum field, relevant only for numeric types.

Currently we produce an error in those situations, but we can be more permissive while still abiding by the spec. While these types of constructs do seem "linty", we can leave it to other tools to identify and handle this sort of lint.
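
As a rough illustration of the more permissive behaviour proposed here, the sketch below splits a schema's keywords into those that apply to the declared type and those that could simply be ignored. The names `applies_to` and `split_keywords` and the keyword table are assumptions made for this example; they are not typify's API, and the table covers only a small subset of JSON Schema keywords.

```rust
/// Maps a validation keyword to the instance type it constrains.
/// Hypothetical, partial table for illustration only.
fn applies_to(keyword: &str) -> Option<&'static str> {
    match keyword {
        "minimum" | "maximum" | "multipleOf" => Some("number"),
        "minLength" | "maxLength" | "pattern" => Some("string"),
        "items" | "minItems" | "maxItems" | "uniqueItems" => Some("array"),
        "properties" | "required" | "additionalProperties" => Some("object"),
        _ => None, // not type-specific, e.g. "title" or "description"
    }
}

/// Returns (relevant, ignorable) keywords for a schema with a single `type`.
fn split_keywords<'a>(ty: &str, keywords: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    // Numeric keywords apply to both "integer" and "number".
    let ty = if ty == "integer" { "number" } else { ty };
    keywords.iter().copied().partition(|kw| match applies_to(kw) {
        Some(target) => target == ty,
        None => true,
    })
}
```

With something like this, a converter could drop the second list (or hand it to a separate lint tool) instead of returning an error.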

Major rewrite notes

In the course of building and enhancing typify I've found many shortcomings of the current design. This issue is to catalogue those observations and ideas.

Moving away from numeric IDs

Managing a numeric ID space, allocating IDs, looking up by IDs... it's not very Rust-y. I think we could do something far more idiomatic by referencing types with an Rc<RefCell<TypeEntry>> -- we could call it a TypeRef. Anywhere we currently have a TypeId embedded within a TypeEntry (its details and variants, etc.) would become a TypeRef. This would allow for two things: direct use of the referenced type (i.e. we would not need to pass around the TypeSpace everywhere to do lookups), and (so-called) "interior-mutability" by which we could modify types as we walk the graph. The latter would be helpful e.g. when resolving type-containment cycles.

Regarding type-containment cycles. They're bad. Why? Because the generated code won't build. We need to break them by replacing some of the types with Box<T> of the type. With the RefCell we could borrow_mut() and break the cycles as we find them. Without it, we would need to effectively discover all splices required, return out a list, and then process the list (i.e. once no longer holding the shared references required when walking the type graph).
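
To make the idea concrete, here is a minimal sketch of breaking containment cycles in place through a TypeRef. The `TypeEntry` variants and the `break_cycles` function are hypothetical stand-ins invented for this example, not the current typify data model. The walk snapshots a node's fields before recursing so that no borrow() is still held when borrow_mut() is called.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical, heavily simplified type entry.
#[derive(Debug)]
enum TypeEntry {
    String,
    Struct { name: String, fields: Vec<(String, TypeRef)> },
    Boxed(TypeRef), // rendered as Box<T> in generated code
}

type TypeRef = Rc<RefCell<TypeEntry>>;

/// Depth-first walk; `stack` holds the structs currently being visited.
/// A field that points back into the stack is replaced with a Boxed wrapper.
fn break_cycles(ty: &TypeRef, stack: &mut Vec<*const RefCell<TypeEntry>>) {
    // Clone the field references so the borrow ends before we mutate.
    let fields: Vec<TypeRef> = match &*ty.borrow() {
        TypeEntry::Struct { fields, .. } => fields.iter().map(|(_, t)| Rc::clone(t)).collect(),
        _ => return,
    };
    stack.push(Rc::as_ptr(ty));
    for (i, child) in fields.iter().enumerate() {
        if stack.contains(&Rc::as_ptr(child)) {
            // Back edge: splice in a Box right where the cycle was found.
            if let TypeEntry::Struct { fields, .. } = &mut *ty.borrow_mut() {
                fields[i].1 = Rc::new(RefCell::new(TypeEntry::Boxed(Rc::clone(child))));
            }
        } else {
            break_cycles(child, stack);
        }
    }
    stack.pop();
}
```

Calling `break_cycles(&root, &mut Vec::new())` on a graph where A contains B and B contains A leaves B's field wrapped in `Boxed`. Note that Rc cycles like this are never freed; a real design would need to account for that (for example with Weak references or an arena).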

Type states

We have some (many?) implicit assumptions floating around. Some that are the most subtle are the implied constraints when processing a collection of types that may be the target of a $ref construct. In particular, while we can resolve references (lookup by name in a string -> type mapping) we can't inspect the referenced TypeEntry. In current terms this means we can't self.id_to_entry.get(type_id).unwrap() because we may not yet have converted the referenced type. In the RefCell world above, it means that we may not borrow() or borrow_mut() on the referenced type because we might either get the converted type or a placeholder reference (the latter of which would be RefCell::replaced).

Ideally we would enforce these sorts of constraints using the type system. We need to be thoughtful and careful about this design so that we don't end up copying huge amounts of state.

Note that we can avoid some of this if the type graph is acyclic by topo-sorting, but ideally we'd like to support type graphs that contain reference cycles.
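
For the acyclic case, the ordering could look roughly like the sketch below. `conversion_order` is a hypothetical helper, and the input map (definition name to the names it references) is an assumed representation rather than anything typify exposes today. It also shows where this approach stops working: as soon as a reference cycle exists, some definitions can never become ready.

```rust
use std::collections::BTreeMap;

/// Orders definitions so that each one comes after everything it references.
/// Returns Err with the names that could not be ordered (members of a
/// reference cycle, or definitions that depend on one).
fn conversion_order(refs: &BTreeMap<&str, Vec<&str>>) -> Result<Vec<String>, Vec<String>> {
    let mut order: Vec<String> = Vec::new();
    let mut remaining: Vec<&str> = refs.keys().copied().collect();
    loop {
        // A definition is ready once all of its references are already ordered.
        let (ready, blocked): (Vec<&str>, Vec<&str>) = remaining
            .iter()
            .copied()
            .partition(|name| refs[name].iter().all(|dep| order.iter().any(|done| done == dep)));
        if ready.is_empty() {
            return if blocked.is_empty() {
                Ok(order)
            } else {
                Err(blocked.into_iter().map(String::from).collect())
            };
        }
        order.extend(ready.into_iter().map(String::from));
        remaining = blocked;
    }
}
```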

Type bundles / transactions

Related to type states, we might think about the conversion process in bundles or transactions: whether for a single schema or a dictionary of schemas (i.e. references), we could build up a bundle of types, navigate them through various states, and then apply them to the TypeSpace. Note that these bundles could (and would) contain references (TypeId or TypeRef) to extant types, and that would be fine.

This could help clean up error semantics: if a schema conversion failed, the TypeSpace could be kept in a known, consistent, unperturbed state rather than potentially containing some of a converted dictionary (for example).
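
A minimal sketch of the transactional shape, assuming a drastically simplified TypeSpace that maps names to Rust type strings. The `convert` and `add_all` functions are hypothetical. Everything is staged in a local bundle first, and the TypeSpace is touched only after every conversion has succeeded.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the real TypeSpace.
#[derive(Default)]
struct TypeSpace {
    types: BTreeMap<String, String>,
}

impl TypeSpace {
    /// Toy conversion from a JSON Schema `type` to a Rust type name.
    fn convert(name: &str, schema_type: &str) -> Result<String, String> {
        match schema_type {
            "string" => Ok("String".to_string()),
            "integer" => Ok("i64".to_string()),
            "boolean" => Ok("bool".to_string()),
            other => Err(format!("{name}: unsupported type {other}")),
        }
    }

    /// Converts a dictionary of schemas as one bundle: all or nothing.
    fn add_all(&mut self, schemas: &[(&str, &str)]) -> Result<(), String> {
        let mut bundle = BTreeMap::new();
        for (name, schema_type) in schemas {
            // An early return here leaves `self.types` untouched.
            bundle.insert(name.to_string(), Self::convert(name, schema_type)?);
        }
        self.types.extend(bundle); // commit
        Ok(())
    }
}
```

If the second of two schemas fails to convert, the first is discarded along with the bundle, so the TypeSpace never holds half of a dictionary.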

Separate mutable and immutable structures

This may be less of a concern if we move to the interior mutability of TypeRef, but currently we can get snagged on shared references to TypeSpace::definitions while passing around an exclusive reference during conversions. We could avoid this by separating out mutable and immutable state as we pass it around various functions.

This can happen, for example, if we "resolve" a schema i.e. pick it apart and chase references. This process results in the TypeSpace being borrowed (via the borrow on the Schema) which means we can't get the exclusive reference to e.g. add new types.

Conversion on Schemas only

This is already mostly true, but we should enforce that the conversion step deals only with raw schemas and dictionaries and (as noted above) prohibit this pass from looking at processed types (which may not yet be fully processed).

We should have as many passes as we want, moving between different IRs; each pass should be forced to look only at structures that are stable during the pass. We should not look at structures being actively mutated during the pass.

Optional dependencies

Right now we assume that users are fine using chrono and uuid. We might add some other dependencies, for example one for constrained numeric types. It would be cool if users could input the list of dependencies they're willing to have. Indeed, if schemars or something similar included an extension to identify a type, we could even automatically sub in types from an allow-list of crates.

Naming and defaults

Naming and defaults are both inconsistently and verbosely handled. We get a bit confused because most conversion functions return both metadata and a type. We use the metadata e.g. to add docs to a struct property. We should evaluate metadata that's directly applicable to the generated type and then pass out metadata explicitly and exclusively for use e.g. to construct struct properties. Note that these may have some overlap, for example (maybe) with defaults.

Variant names

In particular for untagged enums, it can be challenging to infer a good variant name when initially decoding the schema. It might be reasonable to add an explicit pass through the graph where we revisit variant naming. This pass would likely happen before containment cycle elimination.

Closer-to-zero copy

We don't need to get crazy about zero copy, but we can do more to copy less. In particular, it would be cool to have types retain their JSON schema information without needing to copy it around everywhere. This is relevant for both debugging and generation e.g. of a JsonSchema impl or doc comments.

IR / Canonical forms

In particular to handle different schema types, we should have an intermediate representation that is slightly less redundant than base JSON schema. For example, use enum rather than const, eliminate if / then / else in favor of oneOf: [ allOf: [ if, then ], allOf: [ not: [if], else ] ], etc. Beyond that we probably want to take another pass to put things into a canonical form (or perhaps that would be another IR).
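
The two rewrites mentioned above could be sketched as follows. The `Schema` enum is a hypothetical toy IR invented for this example (it is not the schemars representation), with `Named` standing in for any leaf schema.

```rust
// Hypothetical, minimal IR; real schemas carry far more information.
#[derive(Clone, Debug, PartialEq)]
enum Schema {
    Named(String),
    Const(String),
    Enum(Vec<String>),
    Not(Box<Schema>),
    AllOf(Vec<Schema>),
    OneOf(Vec<Schema>),
    IfThenElse { cond: Box<Schema>, then: Box<Schema>, otherwise: Box<Schema> },
}

/// Rewrites `const` as a single-value `enum`, and if/then/else as
/// oneOf: [ allOf: [if, then], allOf: [not if, else] ].
fn canonicalize(schema: Schema) -> Schema {
    match schema {
        Schema::Const(value) => Schema::Enum(vec![value]),
        Schema::IfThenElse { cond, then, otherwise } => {
            let cond = canonicalize(*cond);
            Schema::OneOf(vec![
                Schema::AllOf(vec![cond.clone(), canonicalize(*then)]),
                Schema::AllOf(vec![Schema::Not(Box::new(cond)), canonicalize(*otherwise)]),
            ])
        }
        Schema::Not(inner) => Schema::Not(Box::new(canonicalize(*inner))),
        Schema::AllOf(items) => Schema::AllOf(items.into_iter().map(canonicalize).collect()),
        Schema::OneOf(items) => Schema::OneOf(items.into_iter().map(canonicalize).collect()),
        other => other,
    }
}
```

After this pass, later stages would never need to handle `Const` or `IfThenElse`, which is the point of having a less redundant form.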

Plan for modeling of unusual types

This issue is to track ideas for how we might model JSON Schema types in Rust. The particular goals are to model the types as closely as possible such that only valid states are representable in Rust. Here's an example:

{
  "type": "object",
  "minProperties": 1,
  "maxProperties": 1,
  "additionalProperties": {
    "type": "string"
  }
}

A valid JSON object might look like this:

{ "foo": "bar" }

It is, in other words, a single key-value pair (both strings). Absent the min/max property count constraints, we would model this in Rust as a Map of some kind (e.g. HashMap<String, String>). With those constraints this could be precisely modeled as a tuple of Strings:

struct GeneratedType(String, String);

This is great in that users can only generate values that conform to the schema.

The wrinkle is that the default serialization of GeneratedType("foo".into(), "bar".into()) would be ["foo", "bar"]. For this type, and for others like it, we need both a representation in Rust and the proper se/deserialization. One way to address this is with "helper" types that implement the appropriate Ser/De functions:

struct GeneratedType(json_helper_types::ObjectSingleton<String, String>);

Explicit, inline types might be simpler to use and understand, but managing the serialization functions would be more cumbersome. As such I propose the following:

  • Create a new crate (within this workspace?) where we create types that match these special cases
  • Create a setting for typify that lets consumers opt-in or opt-out of its use (opting out would render less-specific, less-precise types)
  • Eventually inline the types and rely on the crate for serde implementations
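
As a sketch of what such a helper type might guarantee, the example below converts between a map and a single key-value pair and rejects any map that does not hold exactly one entry. This `ObjectSingleton` is an assumption about how the proposed type could look; it uses only the standard library and leaves out the serde implementations, which in a real crate would serialize the pair as a one-property JSON object.

```rust
use std::collections::BTreeMap;
use std::convert::TryFrom;

/// Hypothetical helper: exactly one key-value pair.
#[derive(Debug, PartialEq)]
struct ObjectSingleton<K, V>(K, V);

impl<K: Ord, V> TryFrom<BTreeMap<K, V>> for ObjectSingleton<K, V> {
    type Error = String;

    // Enforces minProperties = maxProperties = 1.
    fn try_from(map: BTreeMap<K, V>) -> Result<Self, Self::Error> {
        let len = map.len();
        let mut entries = map.into_iter();
        match (entries.next(), entries.next()) {
            (Some((key, value)), None) => Ok(ObjectSingleton(key, value)),
            _ => Err(format!("expected exactly 1 property, found {len}")),
        }
    }
}

impl<K: Ord, V> From<ObjectSingleton<K, V>> for BTreeMap<K, V> {
    // The reverse direction always succeeds and yields a one-entry map.
    fn from(single: ObjectSingleton<K, V>) -> Self {
        BTreeMap::from([(single.0, single.1)])
    }
}
```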

consider type naming

JSON schema gives us two direct sources of type names: the keys in the references dictionary and the title field in schema metadata. Note also that while the former are necessarily unique, the latter may not be. In addition, we can get hints about names. Consider for example a JSON object that looks like this:

{
    "fooObj": {
        "a": 7,
        "b": 8
    }
}

The outer object may have a name, but the value of fooObj may not (or it may have a title that is not unique). We could include a "distinguishing hint" in these cases which might be the enclosing type and property name, or may come from the external caller e.g. our OpenAPI processor could use the operationId.

After processing all types, we could choose to always include the distinguishing hint in the name, only include it if it was needed to resolve conflicting types, or raise an error if it was needed to disambiguate conflicting types.
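
The middle option (use the hint only when needed) might look like the sketch below. `assign_names` and the way the hint is prepended to the name are assumptions for illustration. It also shows why names can only be finalized after every schema has been seen: the first "Config" cannot know it conflicts until the second one arrives.

```rust
use std::collections::BTreeMap;

/// Each candidate is (preferred name, distinguishing hint).
/// The hint is prepended only for names that occur more than once.
fn assign_names(candidates: &[(&str, &str)]) -> Vec<String> {
    // First pass: count how often each preferred name occurs.
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for (name, _) in candidates {
        *counts.entry(*name).or_insert(0) += 1;
    }
    // Second pass: disambiguate only the conflicting names.
    candidates
        .iter()
        .map(|(name, hint)| {
            if counts[name] > 1 {
                format!("{hint}{name}")
            } else {
                name.to_string()
            }
        })
        .collect()
}
```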

This would require a change in the API which currently is of the form schemars::schema::Schema -> proc_macro2::TokenStream i.e. we will no longer be sure of the type name until all schemas have been examined.

deny_unknown_fields on a oneOf enum prevents use of flattened additionalProperties

When a subschema uses a oneOf where one variant uses additionalProperties, all objects fail to deserialize, with an error claiming that there is no matching variant for the generated untagged enum. Upon inspection, it seems that during a successful generation the deny_unknown_fields attribute is added to the enum, and that attribute is documented to not play nicely with the flatten attribute. Note that this behavior is not consistently seen on all such enums, so it might be an edge case with my example specifically. Adding an explicit "additionalProperties": true on the object causes generation to fail.

Sample schema fragment:

"References": {
    "oneOf": [
        { 
            "$comment": "An array of mod names. Version is inferred to be @modlinks." ,
            "type": "array", 
            "items": { "type": "string" }
        },
        { 
            "$comment": "Mapping of mod name to the desired version",
            "type": "object", 
            "additionalProperties": 
            { 
                "oneOf": [
                    { "$ref": "#/definitions/StringVersion" },
                    { "$ref": "#/definitions/ReferenceDef" }
                ]
            }
        }
    ]
}

The corresponding generated Rust code is:

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(untagged, deny_unknown_fields)] //problem is here, because Variant1.extra is flattened
pub enum References {
    Variant0(Vec<String>),
    Variant1 {
        #[serde(flatten)]
        extra: std::collections::HashMap<String, ReferencesVariant1ExtraValue>,
    },
}

Allow configuration of package dependencies and schema conversions

Currently, typify can generate code that depends on uuid and chrono. It might be useful to enable support for other packages or it might be useful to deny use, say, of uuid and represent a { "type": "string", "format": "uuid" } as a String rather than as an uuid::Uuid. One could even imagine allowing substitutes for the std types for arrays, maps, and sets.

In addition, it could be useful to allow consumers to specify new associations between formats and Rust types or to override existing ones. For example, a user might want to use `rust_decimal::Decimal` for the "decimal" format or `chrono::naive::NaiveDate` for the "date" format (rather than `chrono::Date`).

We could modify the macro, builder, and command-line interfaces to allow this (perhaps all through a shared Settings type). e.g.

import_types!(
    schema = "../example.json",
    allow_packages = [uuid, chrono],
    additional_string_formats = {
        "decimal" = "rust_decimal::Decimal",
        "date" = "chrono::naive::NaiveDate",
    },
);
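
On the builder side, a shared Settings type could resolve string formats along these lines. Everything here is hypothetical: the `Settings` struct, its field names (borrowed from the macro sketch above), the `string_type` method, and the small built-in table are illustrative assumptions, not an existing typify interface.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical settings shared by the macro, builder, and CLI.
#[derive(Default)]
struct Settings {
    allow_packages: BTreeSet<String>,
    additional_string_formats: BTreeMap<String, String>,
}

impl Settings {
    /// Picks the Rust type for a { "type": "string", "format": ... } schema.
    fn string_type(&self, format: &str) -> String {
        // User-supplied associations take priority over built-in ones.
        if let Some(path) = self.additional_string_formats.get(format) {
            return path.clone();
        }
        // Built-in associations: (required package, Rust type). Illustrative only.
        let builtin = match format {
            "uuid" => Some(("uuid", "uuid::Uuid")),
            "date-time" => Some(("chrono", "chrono::DateTime<chrono::Utc>")),
            _ => None,
        };
        match builtin {
            Some((package, path)) if self.allow_packages.contains(package) => path.to_string(),
            // Unknown format, or package not allowed: fall back to String.
            _ => "String".to_string(),
        }
    }
}
```

Denying uuid is then just a matter of leaving it out of `allow_packages`, in which case a "uuid" format falls back to String.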

The application panicked (crashed).

Schema: https://github.com/oasis-tcs/csaf/blob/master/csaf_2.0/json_schema/provider_json_schema.json

Started with:

cargo typify provider_json_schema.json

Results in:

The application panicked (crashed).
Message:  key publisher is missing
Location: /home/abc/.cargo/registry/src/github.com-1ecc6299db9ec823/typify-impl-0.0.12/src/convert.rs:967

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Generated code fails to build due to wrong scoping on "impl Default for <Enum>"

When generating code that makes use of defaults in enums, the scope of the generated Default implementation is wrong, which prevents the generated code from building.

Steps to Reproduce

Using the following JSON:

{
    "definitions": {
        "test-enum": {
            "type": "string",
            "enum": ["failure", "skipped", "success"],
            "default": "failure"
        }
    }
}

The generated default implementation is the following:

impl Default for TestEnum {
    fn default() -> Self {
        super::TestEnum::Failure
    }
}

Expected behaviour

The generated default implementation has the correct scope for the enum generated:

impl Default for TestEnum {
    fn default() -> Self {
        TestEnum::Failure
    }
}

or alternatively:

impl Default for TestEnum {
    fn default() -> Self {
        Self::Failure
    }
}
