This is something that I have been thinking about for quite a while, and I think I finally came up with something workable. Sorry about the length, I just wanted to make a complete brain-dump of my thoughts on this for discussion...
Background
Currently, extensions may introduce new types (in the sense of data or newtype in Haskell), but only by forwarding to a new struct/enum/union type. To check whether something is an extension type (and possibly extract the values of type parameters), we must resort to cumbersome methods (e.g. as seen in the closure extension), since we are not allowed to match directly on the extension type production if we want to maintain coherence. It seems that we could benefit from adding a way of introducing new, non-forwarding productions into the type hierarchy.
Note that deciding to use nonterminals vs. closed nonterminals typically involves some kind of tradeoff - either we can allow arbitrary analyses and restrict new variants to be expressed in terms of the host, or allow arbitrary variants and restrict analyses. However, in this case, the host language already allows arbitrary variants of types, which must be given some form of default treatment by the type system and any added analyses. So permitting arbitrary new types to be introduced by extensions doesn't impose any additional restrictions, since any new analyses on types would have a default in the form of the value given to structs.
As another advantage, having some form of closed types would permit us to return to something along the lines of our prior, somewhat less confusing method of overloading based on higher-order dispatch attributes on types, without suffering any interference issues. Although the most basic rule for coherence on closed nonterminals states that the value of a new attribute on any production should be equivalent to the default value, we have found other situations that are exceptions to this, such as what we did for Def in order to simulate a non-homogeneous collection structure. To evaluate whether a particular case is non-interfering, we simply need to ask whether any glue code would be required for it to compose.
If extension E1 were to introduce a new overload attribute A on some form of closed type nonterminal, with a default equation of nothing(), and extension E2 were to introduce a new production P on this nonterminal, then the value we would expect for A on P is also nothing(), since E1 knows nothing about E2 and should not overload that type. The default and expected values are then the same, so no glue code is required. This is true in general for this type of attribute, and I hope to prove this more rigorously after reading that section of Ted's thesis, although it should be somewhat intuitively clear.
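To make the composition argument above concrete, here is a minimal toy model in Python. It is not Silver code and not how ableC is implemented; the class ClosedNonterminal, the production name e1Type, and the string "e1_overload" are all hypothetical names invented for this sketch. It only models the one property under discussion: an attribute on a closed nonterminal carries a default equation, and a production from an independent extension falls back to that default.

```python
from typing import Callable, Dict, Optional, Tuple


class ClosedNonterminal:
    """Toy model: every attribute has a default equation, and a production
    may override the equations of attributes it knows about."""

    def __init__(self) -> None:
        self.defaults: Dict[str, Callable[[], Optional[str]]] = {}
        self.equations: Dict[Tuple[str, str], Callable[[], Optional[str]]] = {}

    def add_attribute(self, attr: str, default: Callable[[], Optional[str]]) -> None:
        self.defaults[attr] = default

    def add_production(self, prod: str, equations=None) -> None:
        for attr, eq in (equations or {}).items():
            self.equations[(prod, attr)] = eq

    def evaluate(self, prod: str, attr: str) -> Optional[str]:
        # Use the production's own equation if it has one, else the default.
        eq = self.equations.get((prod, attr), self.defaults[attr])
        return eq()


def install_e1(nt: ClosedNonterminal) -> None:
    # E1 (hypothetical): overload attribute A with default nothing(), modelled as None,
    # plus one production of its own that actually provides an overload.
    nt.add_attribute("A", default=lambda: None)
    nt.add_production("e1Type", {"A": lambda: "e1_overload"})


def install_e2(nt: ClosedNonterminal) -> None:
    # E2 (hypothetical): a new production P; E2 has never heard of attribute A.
    nt.add_production("P")


def compose(*installers) -> ClosedNonterminal:
    nt = ClosedNonterminal()
    for install in installers:
        install(nt)
    return nt


# Composing in either order needs no glue code: A on P is just the default.
for nt in (compose(install_e1, install_e2), compose(install_e2, install_e1)):
    assert nt.evaluate("P", "A") is None
    assert nt.evaluate("e1Type", "A") == "e1_overload"
```

The point of the sketch is only that the value E1 would want for A on P coincides with what the default already supplies, regardless of composition order. An attribute like an equality test would not fit this model, since no single default is correct for productions the attribute's author has never seen.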
Note that some analyses do not have this property of having a valid default for productions in independent extensions, such as an equality test via isEqualTo/isEqual-type attributes, and would need to be included in the host. However, this is the only example I can think of that would be affected by this change, due to the presence of struct/enum/union types. For example, adding a new transformation on types similar to lifted would just leave new extension types unchanged, as is the case with structs right now, and currently an analysis for whether a type contains a pointer isn't possible since we can't look inside structs.
Proposal
Although theoretically an option, closing the entire Type nonterminal would require major changes in lots of places and could complicate how we deal with interconverting type expressions and types, due to C's division into base type and type modifier expressions.
A better option would be to add a new closed nonterminal, ExtType (for 'extension type' or 'external type' - whatever), and a new production extType :: (Type ::= Qualifiers ExtType). ExtType would have almost all the same attributes as Type, although most of these would have defaults. In addition, it would have isEqualTo/isEqual attributes for checking equality, and the qualifiers would be passed in from extType via an inherited attribute.
Since every type has a corresponding type expression, extType would have a corresponding BaseTypeExpr production, extTypeExpr :: (BaseTypeExpr ::= Qualifiers ExtType). The transformation between these would be trivial, and host and pp on extTypeExpr would come directly from ExtType, where they would be explicitly defined by every extension ExtType production. extTypeExpr would also have no errors, defs, freeVariables, globalDecls, or typeModifiers.
Then, for an extension to introduce a new type, it would introduce a new, non-forwarding production on ExtType, defining at least the equations for pp, host, and isEqual. The extension would also define a BaseTypeExpr production forwarding to extTypeExpr wrapping the newly defined ExtType. Note that extensions could still also define normal, forwarding Type productions that are equivalent to host types, as we do now.
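The shape of this proposal can be sketched with a small Python model. This is an illustration only, not Silver and not ableC's API: the classes below merely mirror the proposed ExtType, extType and extTypeExpr, while PairExtType, is_integer_type, and the string-valued host translation are hypothetical choices made up for the example. Qualifiers are stored on the wrapper here rather than passed down as an inherited attribute, which is a simplification.

```python
from dataclasses import dataclass
from typing import Tuple


class ExtType:
    """Models the proposed closed nonterminal ExtType."""

    # Every extension production must define these three.
    def pp(self) -> str:
        raise NotImplementedError

    def host(self) -> str:
        raise NotImplementedError

    def is_equal(self, other: "ExtType") -> bool:
        raise NotImplementedError

    # Stand-in for the many Type attributes that would simply get a default
    # (the attribute name is hypothetical).
    def is_integer_type(self) -> bool:
        return False


@dataclass(frozen=True)
class ExtTypeWrapper:
    """Models extType :: (Type ::= Qualifiers ExtType)."""
    qualifiers: Tuple[str, ...]
    sub: ExtType

    def pp(self) -> str:
        return " ".join(self.qualifiers + (self.sub.pp(),))

    def to_type_expr(self) -> "ExtTypeExpr":
        return ExtTypeExpr(self.qualifiers, self.sub)


@dataclass(frozen=True)
class ExtTypeExpr:
    """Models extTypeExpr :: (BaseTypeExpr ::= Qualifiers ExtType)."""
    qualifiers: Tuple[str, ...]
    sub: ExtType

    def pp(self) -> str:
        return " ".join(self.qualifiers + (self.sub.pp(),))

    def typerep(self) -> ExtTypeWrapper:
        return ExtTypeWrapper(self.qualifiers, self.sub)


@dataclass(frozen=True)
class PairExtType(ExtType):
    """Hypothetical extension production: non-forwarding, so it can be matched on directly."""
    first: str
    second: str

    def pp(self) -> str:
        return f"pair<{self.first}, {self.second}>"

    def host(self) -> str:
        # Hypothetical host translation to a generated struct name.
        return f"struct _pair_{self.first}_{self.second}"

    def is_equal(self, other: ExtType) -> bool:
        return (isinstance(other, PairExtType)
                and (self.first, self.second) == (other.first, other.second))


t = ExtTypeWrapper(("const",), PairExtType("int", "float"))
assert t.pp() == "const pair<int, float>"
assert t.to_type_expr().typerep() == t  # type <-> type expression round-trips trivially
```

In this model, checking whether a type is the extension's own type and extracting its parameters is a direct isinstance test on the ExtType child, which is the capability the forwarding-only approach lacks.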
Since they would have similar semantics, TagTypes (consisting of enumTagType, and refIdTagType for structs and unions) should probably be moved to being productions on ExtType instead. This would mean that tagReferenceTypeExpr would have as its typerep extType(...) instead of tagType(...), which would be transformed back into extTypeExpr(...). This is a non-issue, as these would have the same pp, and it would actually be a slight performance improvement, since we would avoid looking up the refId when re-computing the typerep.
The TagType nonterminal, as far as I can tell, is only there to keep things better organized, so merging it with ExtType shouldn't break too many things, and the only major refactoring would be to replace all references to tagType with extType.
Overloading
Modifying overloading doesn't have to happen at the same time as these changes, but we do eventually want to be able to specify overloads on host types. We have a couple of options here:
- Largely keep the current mechanism, and modify it such that an ExtType may specify a module name via an attribute. This would be less work, but more confusing in the long run. Also, with the addition of ExtTypes, being able to specify an overload on a struct type becomes redundant.
- Return to a mechanism similar to what we had previously, utilizing dispatch productions, but non-interfering as I informally proved above. This method would be slightly more work now, but would be less confusing to new users of ableC, and could allow us to do away with some of the magic needed to make __attribute__((module(...))) work properly. This approach would also be a bit simpler than it was when first implemented, due to a few lessons learned since then.
@tedinski @ericvanwyk any thoughts?