The parser currently parses proto2 and proto3 correctly into an AST, but performs no further processing or semantic validation.
Proposed additions:
Process import statements
The parser methods need to be extended with a way to satisfy imports, e.g., a function parameter that supplies the imported source. For Parse.fromFile, this could be one or more include paths. How to provide this when parsing strings and streams is an open question.
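One possible shape for such a parameter is sketched below in Python, purely as an illustration. The names ImportResolver, include_path_resolver and in_memory_resolver are hypothetical and not part of the parser's API. The idea is that the parser only ever sees a function from import name to source text, so files, strings and streams can each be given their own resolver.

```python
import os
from typing import Callable, Dict, Iterable, Optional

# Hypothetical resolver type: given an import name, return the imported
# source text, or None if the import cannot be satisfied.
ImportResolver = Callable[[str], Optional[str]]


def include_path_resolver(include_paths: Iterable[str]) -> ImportResolver:
    """Resolve imports by searching a list of directories, in order."""
    paths = list(include_paths)

    def resolve(name: str) -> Optional[str]:
        for base in paths:
            candidate = os.path.join(base, name)
            if os.path.isfile(candidate):
                with open(candidate, encoding="utf-8") as f:
                    return f.read()
        return None  # not found on any include path

    return resolve


def in_memory_resolver(sources: Dict[str, str]) -> ImportResolver:
    """Resolve imports from a name -> source mapping (one option for strings and streams)."""
    return sources.get
```

With this shape, a caller parsing from a string could pass in_memory_resolver({"common.proto": "..."}), while a file-based entry point could build an include_path_resolver from its include paths.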
Normalize Type Names
Convert the various naming conventions into a single normalized form. The output convention should be caller-selectable: camelCase, PascalCase, snake_case, etc.
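A minimal sketch of one way to do this, in Python for illustration only: split every identifier into lowercase words (the normalized form), then join the words in whichever convention the caller selects. The function names are hypothetical, and the word-splitting rules (underscores, lower-to-upper transitions, acronym boundaries) are an assumption about what "normalized" should mean here.

```python
import re
from typing import List


def split_words(name: str) -> List[str]:
    """Split an identifier into lowercase words, whatever its convention."""
    words: List[str] = []
    for part in name.split("_"):
        # "fooBar" -> "foo Bar" (lowercase or digit followed by uppercase)
        part = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", part)
        # "HTTPServer" -> "HTTP Server" (end of an acronym)
        part = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", part)
        words.extend(w.lower() for w in part.split())
    return words


def to_snake_case(name: str) -> str:
    return "_".join(split_words(name))


def to_pascal_case(name: str) -> str:
    return "".join(w.capitalize() for w in split_words(name))


def to_camel_case(name: str) -> str:
    words = split_words(name)
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])
```

Note that round-tripping loses acronym casing under these rules (HTTPServer becomes HttpServer in PascalCase), which may or may not be acceptable.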
Flatten Type Names
Messages can contain inner message and enum definitions. These are publicly visible and can be referenced from other types. It will probably be convenient to flatten these symbols by moving them from the inner scope to the top-level scope and renaming each type to its fully-qualified scope name; e.g., "Outer.Inner".
In addition, Record types cannot contain inner types. Since .NET can handle "+" in a symbol name, the names could be normalized from "Outer.Inner" to "Outer+Inner" either in this step, or when doing code-gen for Records.
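The flattening step can be sketched as a simple recursive walk. The example below is in Python for illustration; TypeDef is a hypothetical stand-in for the parser's message and enum AST nodes, and the separator is a parameter so that either "Outer.Inner" or "Outer+Inner" can be produced.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TypeDef:
    """Hypothetical stand-in for a message or enum node in the AST."""
    name: str
    kind: str  # "message" or "enum"
    nested: List["TypeDef"] = field(default_factory=list)


def flatten(types: List[TypeDef], scope: str = "", sep: str = ".") -> List[TypeDef]:
    """Return every type at the top level, renamed to its fully-qualified scope name."""
    flat: List[TypeDef] = []
    for t in types:
        qualified = f"{scope}{sep}{t.name}" if scope else t.name
        flat.append(TypeDef(qualified, t.kind))  # copy without nested types
        flat.extend(flatten(t.nested, qualified, sep))
    return flat


outer = TypeDef("Outer", "message", [TypeDef("Inner", "message")])
print([t.name for t in flatten([outer])])           # ['Outer', 'Outer.Inner']
print([t.name for t in flatten([outer], sep="+")])  # ['Outer', 'Outer+Inner']
```

This sketch only renames the definitions. Field types that refer to an inner type would also need to be rewritten to the flattened name.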
Validate Type Usage
Ensure that every message used as a field type and every enum used as a constant is defined somewhere. This must handle use before definition (so it cannot be done in a single pass). It must also handle names qualified with the package name, as well as unqualified names resolved using the current .proto file's package.
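A two-pass check along these lines might look like the following Python sketch. It assumes a deliberately simplified, hypothetical model: each file is a dict with a "package" and a "types" mapping from (already flattened) type name to the list of field type names it uses. Real protobuf name resolution also searches enclosing scopes from the innermost outward, which is not modeled here.

```python
from typing import Dict, List, Set

SCALARS = {
    "double", "float", "int32", "int64", "uint32", "uint64",
    "sint32", "sint64", "fixed32", "fixed64", "sfixed32", "sfixed64",
    "bool", "string", "bytes",
}


def qualify(package: str, name: str) -> str:
    return f"{package}.{name}" if package else name


def validate_type_usage(files: List[Dict]) -> List[str]:
    # Pass 1: record every defined type under its fully-qualified name,
    # so that use before definition is not a problem in pass 2.
    defined: Set[str] = set()
    for f in files:
        for type_name in f["types"]:
            defined.add(qualify(f["package"], type_name))

    # Pass 2: each reference must be a scalar, an already-qualified name,
    # or an unqualified name that resolves within the current file's package.
    errors: List[str] = []
    for f in files:
        for type_name, field_types in f["types"].items():
            for ref in field_types:
                if ref in SCALARS:
                    continue
                if ref in defined or qualify(f["package"], ref) in defined:
                    continue
                errors.append(
                    f"{qualify(f['package'], type_name)}: undefined type '{ref}'"
                )
    return errors
```

The set built in the first pass is effectively the symbol table, so it could be returned alongside the errors for client code to use.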
Resolve Constants
An enum value can be used as a constant in an option definition. Consider either resolving these, or exposing the symbol table used during type usage validation so that client code can do the substitution.
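As a rough illustration of the first alternative, the Python sketch below substitutes known enum value names in a set of options. Ident is a hypothetical wrapper for an identifier constant as it might appear in the AST, and enum_values stands in for the relevant slice of the symbol table. Unknown identifiers are left untouched so that the caller can decide how to report them.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Ident:
    """Hypothetical AST node for an unresolved identifier constant."""
    name: str


OptionValue = Union[Ident, int, float, bool, str]


def resolve_options(
    options: Dict[str, OptionValue], enum_values: Dict[str, int]
) -> Dict[str, OptionValue]:
    """Replace identifier constants with their enum numbers where known."""
    resolved: Dict[str, OptionValue] = {}
    for key, value in options.items():
        if isinstance(value, Ident) and value.name in enum_values:
            resolved[key] = enum_values[value.name]
        else:
            resolved[key] = value  # literals and unknown identifiers pass through
    return resolved


# Illustrative symbol-table slice for the optimize_for file option.
optimize_mode = {"SPEED": 1, "CODE_SIZE": 2, "LITE_RUNTIME": 3}
opts = {"optimize_for": Ident("CODE_SIZE"), "java_package": "com.example"}
print(resolve_options(opts, optimize_mode))
```

Exposing the symbol table instead would amount to handing enum_values to client code and letting it perform this loop itself.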
Capture Comments
Google's protobuf compiler can capture comments related to a symbol. This can be quite useful for generating documentation, such as in a type provider. It is also needed to create a full FileDescriptorSet (a binary protobuf representation of a set of .proto files).