Comments (9)

burnpanck avatar burnpanck commented on June 17, 2024

Hey, first of all, it's great that this is moving forward! I am trying to migrate from the original traits from enthought, since I think their implementation is old, inelegant and very slow to evolve. For me, numpy traits are very important, and while the astropy community has an interesting implementation, I'd prefer a more official trait type from the jupyter project.

In general I find this custom validator registry very elegant and neat. You are right, there is quite a broad set of validations that one could consider. Nonetheless, I do believe there is a justification for a set of very basic validations, namely shape and dimensionality (and in my view also dtype). The efficiency we have in numpy code is very often a result of the immensely powerful broadcasting rules. Such code, however, often depends on the input arrays conforming to some shape requirements. I would argue that more than half of the Array traits will therefore end up with a shape validator (in my code it would definitely exceed 99%). I believe this justifies a standardized (and ideally more direct) interface to at least a basic set of shape validations (why not the very same set that Enthought's array traits supported?). At the very least, traittypes should provide these basic validators.

However, these validators lack one important aspect that traits usually provide: Introspection! There is no clear API to programmatically find out what kind of arrays are accepted and/or contained in such a trait. For me, this is one of the very reasons to use traits at all.
My preferred array trait would expose a clear API that describes what kind of array I should expect when I read the corresponding attribute from a HasTraits instance. Note that this corresponds to the output end of your validator pipeline, describing what arrays are eventually held in the trait. This is what I would consider validation in the strict sense, while the input end, transforming whatever you throw at the trait into something that conforms, is what I would consider coercion or adaption (though traits do indeed handle both in a single validation stage). If one separated these two concepts more clearly, there would be one key benefit: validators could be considered an implicitly AND-connected set of assertions, which would therefore not depend on their order. (If you really ever need an array that is either two-dimensional OR contains integers, then of course you could still implement an explicit OR validator.) They could then be treated much like a tag, i.e. the shape validator(-tag) indicates what shape constraints apply to the value (that is, the effective value, after all input transforms are applied). The more general adapting/coercing validators that actually change the value need to have an order and thus cannot easily be considered independent tags.

I therefore propose the following:

  • Separate adaption and validation registries, where adaption stays an ordered list, and validation behaves as a dictionary mapping some tags (strings or possibly validation classes) to the validation parameters (as a publicly readable API), which are asserted in arbitrary order after adaption is complete.
  • Provide a standard set of basic validators, particularly shape and dtype constraining validators.
  • Optionally, (particularly if the validation dict is not keyed on strings), add keyword arguments to the trait constructor to shortcut access to the most important validators, and similarly provide properties on the trait that allow them to be read.

I'd be happy to attempt an implementation, if there is some basic agreement here.
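The proposal above could look roughly like the following. This is only a minimal sketch and not the traittypes API: the `ArraySpec` class, its `adapters` and `constraints` attributes, and the three tags `shape`, `ndim` and `dtype` are all hypothetical names chosen for illustration.

```python
import numpy as np


class ArraySpec:
    """Hypothetical stand-in for an array trait; NOT the traittypes API."""

    # One checker per tag; each returns True when the value conforms.
    _checkers = {
        "shape": lambda value, shape: value.shape == tuple(shape),
        "ndim": lambda value, ndim: value.ndim == ndim,
        "dtype": lambda value, dtype: np.issubdtype(value.dtype, dtype),
    }

    def __init__(self, adapters=(), **constraints):
        # Adapters may change the value, so their order is preserved.
        self.adapters = list(adapters)
        # Constraints only assert; they are keyed by tag and publicly readable.
        self.constraints = dict(constraints)

    def validate(self, value):
        value = np.asarray(value)
        for adapt in self.adapters:  # ordered coercion stage
            value = adapt(value)
        # Pure assertions on the final value: their order does not matter.
        for tag, param in self.constraints.items():
            if not self._checkers[tag](value, param):
                raise ValueError(f"{tag} constraint {param!r} violated")
        return value


spec = ArraySpec(adapters=[np.squeeze], shape=(2, 2), dtype=np.floating)
declared_shape = spec.constraints["shape"]  # introspection of the constraint
result = spec.validate([[[1.0, 2.0], [3.0, 4.0]]])  # squeezed, then checked
```

Because the constraints only describe the value that is finally stored, reading `spec.constraints` tells a caller what to expect without knowing anything about the adapters that ran before.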

from traittypes.

SylvainCorlay avatar SylvainCorlay commented on June 17, 2024

Hi @burnpanck, thank you for your interest. I will try to answer all your questions!

  • Indeed we don't have a nice introspection mechanism for custom trait validators. I think that if it was to be added, it should probably be in traitlets themselves.
  • I do intend to add common validators like shape to traittypes, but I did not do so immediately so as to have the possibility to improve them before committing to an API. I was also waiting to see how the migration of bqplot to traittypes went before adding more validators within traittypes.

(I am also waiting for some feedback from @astrofrog on this API since he gave a shot at numpy array traits and has different requirements from me.)

  • The following three validators: shape, squeeze and dimension_bounds are all candidates to be part of a traittypes.array_validators submodule.
  • The separation between coercion and validation is probably something we would do now if we were to rewrite traitlets from scratch, but unfortunately they are completely inseparable now!
  • The point on the keyword arguments to access common validators is the only one I don't agree with. In the case of numpy arrays, there are so many possible validations that are all "natural" that having a keyword argument for each of them is likely to bloat the trait type.

The current approach valid(shape(2, 2), subdtype('datetime64')) is already quite short, isn't it?
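For readers unfamiliar with this style, here is a minimal sketch of how such validator factories can be written as closures. The bodies of `shape` and `subdtype`, the `(trait, value)` call signature and the `run_validators` helper are assumptions made for illustration; they are not taken from the traittypes source.

```python
import numpy as np


def shape(*dims):
    """Hypothetical validator factory: require an exact shape."""
    def validator(trait, value):
        if value.shape != dims:
            raise ValueError(f"expected shape {dims}, got {value.shape}")
        return value
    return validator


def subdtype(kind):
    """Hypothetical validator factory: require a sub-dtype of `kind`."""
    def validator(trait, value):
        if not np.issubdtype(value.dtype, kind):
            raise ValueError(f"expected a sub-dtype of {kind}, got {value.dtype}")
        return value
    return validator


def run_validators(value, *validators, trait=None):
    """Stand-in for the trait machinery: apply the validators in order."""
    value = np.asarray(value)
    for validate in validators:
        value = validate(trait, value)
    return value


dates = np.array(
    [["2016-09-06", "2016-09-07"], ["2016-09-08", "2016-09-09"]],
    dtype="datetime64[D]",
)
checked = run_validators(dates, shape(2, 2), subdtype("datetime64"))
```

Note that the arguments `(2, 2)` and `'datetime64'` live only inside the closures here, which is what makes them hard to read back programmatically.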

burnpanck avatar burnpanck commented on June 17, 2024

Thanks for the clarifications. I would still like to lobby a bit more for my proposal (at least some elements of it). Maybe my main point is really that there is a basic set of validations that one should not call custom at all. Most other traits never need any custom validation, the trait itself offers enough functionality. While that API is of course not official, I can easily read from a CaselessStrEnum what options are allowed. In that sense there is introspection for some basic validation. On the other hand, an Array without any further validation offers nothing more than Instance(np.ndarray). I therefore argue that every single instance of that trait will have some validators set, and most often it will be among the three you mentioned.

Let us take inspiration from the container traits: a Tuple trait does not offer much more than Instance(tuple) either. But it offers some basic additional validation functionality that gives it extra value, without offering every conceivable constraint that you could imagine. This way, it also offers some introspection for the constraints, but of course none for custom coercion+validation. In that sense, traitlets does not separate custom coercion from custom validation, but there is a clear set of standard validations on each trait, including usable introspection (even if it is not formally defined).

My proposed (strict) validation registry would offer hybrid functionality: introspection for the validation part on the one hand, while still allowing customisation on the other. This is not that important to me. I'd be perfectly happy with a fixed, trait-defined set of standard validators. They don't have to cover all use cases; one can add custom validation to get finer-grained control over individual traits. But there should be a significant subset of the use cases of the Array trait where no custom validator is required to add enough functionality to justify its use.

If it is a fixed set of validations, then keyword arguments would probably be natural. Other than that, I do not mind a JavaScript-style method chaining interface; indeed, the current approach is short enough.

SylvainCorlay avatar SylvainCorlay commented on June 17, 2024

I dropped the earlier approach of kwargs-specified validators for numpy array because I wanted the content of traittypes not to be opinionated on what should be in a shortlist and be extensible and pluggable instead.

Besides, such a short list would likely conflict with other flavors of the same validators passed to valid. There would be cases where you want the registered validators to run after or before the registered kwargs-specified list etc...

burnpanck avatar burnpanck commented on June 17, 2024

Extensible and pluggable is a good thing. But that does not mean that the most basic functionality cannot be offered more directly, precisely because the functionality can still be extended (i.e. you can always narrow down the accepted values in custom validators). On the other hand, the basic set will not conflict with custom validators passed to valid, since the custom validators should always be applied first. Again, it's the same with all the other standard traits.

SylvainCorlay avatar SylvainCorlay commented on June 17, 2024

I don't see people agreeing on what is basic:

  • subdtype
  • having a specific shape
  • being square
  • dimension bounds
  • squeezing dimensions of length one
  • being broadcastable with a certain list of shapes.

Etc.

In any case there is little benefit of writing

Array(shape=(2, 2))

Over

Array().valid(shape(2, 2))

In the case where there are multiple validations, the former does not allow you to specify their order (coercions may not commute) or their order with respect to other specified validators.
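To make the ordering point concrete, here is a small sketch in which a coercion (`squeeze`) and an assertion (`shape`) do not commute. The validator bodies and the `run` helper are hypothetical and only mimic the `(trait, value)` style discussed in this thread.

```python
import numpy as np


def squeeze(trait, value):
    # Coercion: drops axes of length one, so it changes the value.
    return np.squeeze(value)


def shape(*dims):
    # Assertion: passes the value through unchanged or raises.
    def validator(trait, value):
        if value.shape != dims:
            raise ValueError(f"expected shape {dims}, got {value.shape}")
        return value
    return validator


def run(value, validators):
    # Hypothetical helper standing in for the trait's validation loop.
    for validate in validators:
        value = validate(None, value)
    return value


raw = np.zeros((1, 2, 2))

# Squeezing first turns (1, 2, 2) into (2, 2), so the shape check passes.
ok = run(raw, [squeeze, shape(2, 2)])

# Checking the shape first rejects the very same input.
try:
    run(raw, [shape(2, 2), squeeze])
    order_matters = False
except ValueError:
    order_matters = True
```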

So I think that coming back to kwargs-specified validators is a no-go. We
will provide a set of available validators but not a kwargs-based API.

On Sep 6, 2016 12:27 PM, "Yves Delley" [email protected] wrote:

Extensible and pluggable is a good thing. But that does not mean that the
most basic functionality cannot be offered more directly, precisely because
the functionality can still be extended (i.e. you can always narrow down
the accepted values in custom validators). On the other hand, the basic set
will not conflict with custom validators passed to valid, since the custom
validators should always be applied first. Again, it's the same with all
the other standard traits.


burnpanck avatar burnpanck commented on June 17, 2024

I don't mind dropping the kwargs... I want to be able to access the arguments to shape(2,2), which currently reside deeply hidden in the closure of a function, that I'd have to find in a list and identify by a name. Again, for basic validation, order does not matter. If people can't agree completely what is basic, then take the common subset where most people can agree. This will certainly not be bloat. You can still use custom validators for what didn't make it into that list. And if someone does not consider a certain feature basic, then he does not have to use it.

SylvainCorlay avatar SylvainCorlay commented on June 17, 2024

I don't mind dropping the kwargs... I want to be able to access the arguments to shape(2,2), which currently reside deeply hidden in the closure of a function, that I'd have to find in a list and identify by a name.

This was just an example. The custom validator holding state could be callable objects that have a nice string representation.
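A minimal sketch of what such a callable validator object might look like follows. The class name `Shape`, its `dims` attribute and the `(trait, value)` signature are assumptions for illustration, not an existing traittypes class.

```python
import numpy as np


class Shape:
    """Hypothetical stateful validator; name and attributes are illustrative."""

    def __init__(self, *dims):
        # The parameters are kept as a plain attribute instead of a closure.
        self.dims = dims

    def __call__(self, trait, value):
        if value.shape != self.dims:
            raise ValueError(f"{self!r} rejected shape {value.shape}")
        return value

    def __repr__(self):
        return "Shape(" + ", ".join(str(d) for d in self.dims) + ")"


check = Shape(2, 2)
text = repr(check)             # "Shape(2, 2)"
accepted = check(None, np.zeros((2, 2)))
```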

burnpanck avatar burnpanck commented on June 17, 2024

The string-representation doesn't matter to me, it's the machine readable attributes of the object. However, that still doesn't solve:

  • How to easily find a certain standard validator in the list
  • The need to understand what comes after it in the validation sequence, in case a later validator modifies the value. I don't care about a validator IF the value is modified afterwards (and I don't see a reason to do so either). But IF NOT (i.e. the remaining validators are other, orthogonal basic validators), I would care, only I cannot decide that without inspecting and understanding every validator that follows.
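The two points above can be illustrated with a sketch of the lookup a caller would have to write against an ordered validator list. Everything here is hypothetical: the `Shape` and `Squeeze` classes, the `coerces` flag and the `effective_constraint` helper are invented for this example, and nothing in this thread suggests that such a flag exists.

```python
class Shape:
    coerces = False  # hypothetical flag: this validator only asserts

    def __init__(self, *dims):
        self.dims = dims

    def __call__(self, trait, value):
        if value.shape != self.dims:
            raise ValueError(f"expected shape {self.dims}, got {value.shape}")
        return value


class Squeeze:
    coerces = True  # hypothetical flag: this validator changes the value

    def __call__(self, trait, value):
        return value.squeeze()


def effective_constraint(validators, kind):
    """Return the last `kind` validator if nothing after it coerces, else None."""
    found = None
    for validator in validators:
        if isinstance(validator, kind):
            found = validator
        elif getattr(validator, "coerces", True):
            # A later (or unknown) validator may modify the value, so an
            # earlier constraint no longer describes what is finally stored.
            found = None
    return found


still_valid = effective_constraint([Squeeze(), Shape(2, 2)], Shape)
invalidated = effective_constraint([Shape(2, 2), Squeeze()], Shape)
```

Even in this toy form, the caller has to walk the whole list and make an assumption about every validator it does not recognise.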

Again, what speaks against offering a set of basic, unobtrusive standard validators in the spirit of the standard container traits, letting the Array trait offer more than just the .visit method in addition to Instance(ndarray)? Sorry for keeping this up so persistently, but I do sincerely believe that an Array trait that doesn't do anything more than to offer ways to customize it is not very useful.
