Comments (31)
Wouldn't it be possible to generate code to check for uninitialized references only after an "inner" constructor returns? Some checks could of course be eliminated since the compiler can reason that they must be set (like this.num and this.den above), but in cases where complicated logic occurs, explicit checks could be done just before the implied return of this.
Of course, the trouble with that would be when you have something like this:
type MyType
self::MyType
MyType() = init(this)
end
Here the init function gets a partially initialized MyType object, so its accesses would need to be checked, implying that in general, all accesses would need to be checked.
But what if we required a special type for uninitialized objects? So that init would need to be declared as init(x::Uninitialized{MyType}) instead of just init(x::MyType). Then the compiler knows when compiling the code for init that it needs to generate uninitialized field checks for the x argument, and only init needs to pay the cost, instead of every piece of code everywhere paying the cost. It also uses the type system to prevent uninitialized objects from getting passed where they don't belong. It is very rare to actually want to pass an uninitialized object to an external function for any reason. This allows it, but requires explicit support.
The uninitialized types would always be immediate shadow parents of their initialized concrete counterparts. So the declaration that ConcreteT <: AbstractT would actually imply a type hierarchy like this:
ConcreteT <: Uninitialized{ConcreteT} <: AbstractT
That's a little tricky, but does seem to solve the problem.
from julia.
This is clever, but I don't think we should do it. This type would have to be hacked in as a special case in various little places in the compiler and it could get ugly. It would also introduce (1) a case where the type of a value changes, currently impossible, and (2) a concrete type that has subtypes (Uninitialized is concrete because there can be a value of that type).
It would also break checks like isa(x,ConcreteT) during init(). I'm also not convinced that you have to initialize all fields within the constructor. There might be cases where it's convenient for an object to hang around for a while with uninitialized fields.
Performance is probably not a big deal, since you're testing a word you had to load anyway. Not as bad as a type check where you need an extra load. And not as bad as arrays which are more likely to involve intense looping. We're already doing the null checks there.
Maybe we can use the SIGSEGV handler to turn null accesses into exceptions!
from julia.
Ok, these are all fair points. It's a cute idea though. It does feel a little too clever, admittedly — and a little too much like how Ruby handles the object specific parent class, which is confusing but clever. And it's a good point about testing a word that's already loaded not being likely to be all that expensive.
I like the idea of using a SIGSEGV handler to trap null exceptions. If all fields default to all zero bits, that would make bits types default to zero, which seems sensible, and reference fields would default to NULL, which would allow us to do the SIGSEGV trapping business.
from julia.
Oh darn, I forgot: using SEGV would not give us the behavior we have now, which is to throw an error as soon as a null reference arises, not when it is used. For example, assuming o.f is uninitialized:
a = 2
a = o.f # currently we'd give an error here
...
a.b # with SEGV behavior we'd get an error here
In general, you'd get the error at some random point in the future. Probably not ideal. On the other hand, when accessing an uninitialized element of an unboxed array you don't get any error, so perhaps this is "consistent".
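For readers on a current Julia release, the eager behavior described here can be observed directly: reading an unset reference field raises UndefRefError at the read itself. The following is only a sketch in today's syntax (`mutable struct` rather than the `type` keyword used in this thread), and `Holder` is a hypothetical name introduced for illustration.

```julia
# Hypothetical type whose constructor leaves its reference field unset.
mutable struct Holder
    f::Vector{Int}
    Holder() = new()   # incomplete initialization: `f` stays undefined
end

o = Holder()

# The error comes from the field read itself, not from a later use of the value.
err = try
    o.f
    nothing
catch e
    e
end

println(err isa UndefRefError)
println(isdefined(o, :f))
```

The assignment `a = o.f` from the example above would fail in the same place, before `a.b` is ever reached.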
from julia.
If we're going to allow uninitialized references to hang out as long as one feels like letting them hang out, then this strikes me as the most consistent behavior, honestly. It kind of sucks though because then you can have uninitialized values potentially propagating all over the place from a single accidentally uninitialized field that gets used, but I feel like that's kind of the corollary of allowing them to persist beyond object construction at all.
This is one of those decisions where it's really hard to predict which choice will, in retrospect, have been the right one. It feels very much like we could easily go one way and later really regret it as it becomes clear that the choice was either too lax or too strict. If we go lax, we can always get stricter, but we'd end up potentially breaking people's code. If we go strict, we can always relax things later without breaking anything. But implementing strict behavior is probably harder.
from julia.
It's a tough one. But the lax behavior is the one that everybody has done before, and that the designers of Algol said they regret IIRC. A good rule of thumb is to raise errors as soon as possible (i.e. closer to the root cause). I just tried a cell array access microbenchmark, and the null checks seem to cost about 2%.
from julia.
Honestly, 2% is not bad. Nobody really cares if something is 2% slower (except in benchmark competitions). These days, no one cares that Hadoop is absurdly inefficient, because it can scale, which is far more important. So maybe we should go strict and fix the error that the Algol folks feel that they made.
Then the question becomes how strict to be. Is letting null references exist, but checking for them on usage and throwing an exception, the right trade-off? Can you come up with an example of a case where letting an object be constructed with uninitialized references is really handy?
If we allow uninitialized references to persist beyond construction time, we will need to provide intrinsics to check whether a field is initialized, if for no other reason than that we would need them in order to display objects with uninitialized references without barfing. If there are a lot of other situations where checks like that would have to be made, it would seem like a fairly compelling argument for not allowing uninitialized fields to escape constructors. Think about how much of a mistake allowing the null value in Java is: it causes no end of corner cases and headaches. We do not want to find ourselves in that position when null is replaced with "uninitialized".
from julia.
This also interacts with the immutable issue: obviously you never want to construct an immutable object with uninitialized fields, because then it will be permanently uninitialized.
One idea would be to disallow the overriding of the default inner constructor for immutable types; then you can only construct immutable objects by giving all field values to the default inner constructor. On the other hand, if inner constructors were never allowed to return objects with uninitialized fields, then you could safely allow the definition of non-default inner constructors for immutable objects. However, that implies that the this object is mutable and then becomes immutable after the inner constructor returns. That's very weird and seems to create both a mutable and immutable version of every immutable type. I guess I prefer the former approach: disallow inner constructors for immutable types.
from julia.
Actually, looking at the Rational example above, this is more of an issue than I originally realized since Rational is a perfect candidate for immutability. So what is the type of this in the inner constructor? If it's Rational then it should be immutable, but it's clearly mutable.
from julia.
Exactly; the only way to get rid of "null check hell" is to raise an error as soon as a null reference is accessed.
Checking at the end of a constructor is neither here nor there since you can still call functions during the constructor.
I have thought about the ability to check whether a reference is uninitialized, and I have hesitated. A good example is HashTable, which uses an Array to store the keys and values. Any value can be a hash key, so unused spots must be uninitialized. I use a bitvector to keep track of which spots are used. Sure it would be more efficient to have isassigned(a, i). But then whether something is uninitialized becomes part of your program's behavior, and the next thing you want is the ability to clear a reference back to uninitialized. I guess that's kind of a hybrid approach where you still need null checks, but when you don't do null checks errors are thrown sooner.
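The two bookkeeping strategies mentioned in this comment can be put side by side. This is a minimal sketch in current Julia, where an `isassigned(a, i)` predicate does exist; the variable names `slots` and `used` are made up for the illustration and are not taken from the actual HashTable code.

```julia
# Slots of a Vector{Any} created with `undef` start out unassigned.
slots = Vector{Any}(undef, 4)
used = falses(4)          # the side bit vector described above

# Store one key and record that its slot is occupied.
slots[2] = "key"
used[2] = true

# `isassigned` reports the same information without the extra bookkeeping.
agree = all(isassigned(slots, i) == used[i] for i in 1:4)
println(agree)
```

The trade-off raised above still applies: once code branches on `isassigned`, whether a slot is uninitialized becomes part of the program's behavior.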
What I'm shooting for is to have no null pointers, period. In fact we currently have this with our struct types. But this comes at a cost of too little flexibility in constructing objects. My new proposed contract with the user is that there still may not be any null pointers, but you're on your own to make that happen since the language can no longer guarantee it. From that perspective it might make sense to check at the end of a constructor, but there is no analogous thing you can do for Arrays.
I don't care what happens when printing objects. Printing something as null is kind of dishonest because you can't do anything with that knowledge; accessing the element is still an error. The contract is that an uninitialized object is not allowed to exist, whatever happens is your own fault, but we only check at well-defined points. This is like how you can have prohibited appliances in freshman dorm rooms at Harvard as long as you hide them during vacations when room inspections take place.
from julia.
For immutable objects a.x = y means to make a copy of a with a different value for field x. I guess that copy would have to be a special builtin operation that allows null references. So during the Rational constructor you might actually make 3 Rational objects, but in practice this would be optimized away.
from julia.
My objection to not being able to print or otherwise check for uninitialized fields is that without that ability, you'll have to completely guess where you have an initialized value. If they exist and you can't check for them, then you're really just sticking your head in the sand and pretending they're not there. And you can always just trap UndefRefErrors and then do something else in the catch block. How is that different from having an isassigned predicate? It's effectively the same but less efficient. Basically, if accessing an undefined field is an error, it does affect the program's behavior, regardless.
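The equivalence argued here can be sketched in current Julia. The helper `is_set_by_catch` and the type `Node` are hypothetical names introduced only for this illustration; the built-in `isdefined` is used as the direct predicate to compare against.

```julia
# Hypothetical type with a reference field that starts out unset.
mutable struct Node
    next::Node
    Node() = new()
end

# Emulate the predicate by trapping the error raised on access.
function is_set_by_catch(obj, field::Symbol)
    try
        getfield(obj, field)
        return true
    catch e
        # Only an undefined-reference error means "not set"; anything else propagates.
        e isa UndefRefError || rethrow()
        return false
    end
end

n = Node()
before = is_set_by_catch(n, :next)
n.next = n
after = is_set_by_catch(n, :next)
println((before, after))
```

Both routes give the same answer; the direct predicate simply avoids the cost of raising and catching an exception.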
from julia.
I actually really dislike having a.x = y create a new object and replace a with it. I'm not sure what the benefit of that is. It seems to me that it would be much better to just disallow changing the fields of immutable objects altogether.
from julia.
It's true that you can implement isassigned by catching the exception. That's probably enough to justify the predicate, so I will add it. On another level, I feel like you ought to know when things have been initialized. For example, in matlab people use exist('a') to see if they have set one of their local variables yet, and that always seemed really stupid to me.
A construct for "changing" a field of an immutable object is pretty common; even haskell has it. I feel like people will want to do things like assign the fields of a complex number, and do a[i].im=0. Otherwise you have to write a[i] = complex(a[i].re, 0). I don't much care myself but I think users will want this. We also need it for constructors; otherwise constructors for immutable types will have to use a different mechanism.
from julia.
I'm having trouble with the constructor for RopeString. It uses unbounded mutual recursion among multiple constructor definitions. As far as I can tell the only way to do that with this change is to have the inner RopeString constructor do what new did, and do the recursion among outer generic function definitions for RopeString. But then we have lost abstraction, as the constructor that accepts all fields has become publicly visible.
The problem stems from the automatic constructing and returning of this. To be able to do what RopeString does we need to put constructing back under user control. Could we make new take no arguments and return an uninitialized object?
from julia.
Another possibility: reassign this and hope the unused constructed object is optimized away:
RopeString(h::RopeString, t::RopeString) =
depth(h.tail) + depth(t) < depth(h.head) ?
this = RopeString(h.head, RopeString(h.tail, t)) :
(this.head = h;
this.tail = t;
...)
Or a keyword argument?
RopeString(h.head, RopeString(h.tail, t), this=this)
though that one is a little leaky.
from julia.
I'm suspicious of hoping that the compiler is going to eliminate the creation of temporary objects. Even if it does manage to do so, imo, it puts too much distance between what's conceptually going on and what is actually going on.
If a constructor can call another constructor to do its work, then it seems to me like these are the options:
- don't create a this object to be implicitly passed into the constructor and provide constructors with a way of creating uninitialized objects, e.g. a new thunk
- explicitly pass this into constructors, thereby allowing one constructor to pass the buck to another constructor by passing its this instance to it
- automatically share this between recursive constructor calls so that the recursive call is operating on the same this as the original call
I'm not really sure which is best :-|
from julia.
Here's a weird one... have this.x=y in a constructor be special syntax that first checks whether this is assigned, and if not assigns it to a new instance. So nothing is allocated until you touch this. Clever because type inference is already able to remove undefined-var checks, but not able to remove object allocations. A bit strange but I'm starting to like this idea.
from julia.
Can you explain this bit:
Clever because type inference is already able to remove undefined-var checks, but not able to remove object allocations.
Other than that it seems reasonable enough, but I'm a bit hesitant to add magical behaviors...
from julia.
this.x=y would be converted to this?=new(); this.x=y where ?= is a mythical operator that only evaluates the expression and does the assignment if this has not been set yet. Type inference can prove when a variable has been set, so most of the ?= expressions can be removed. For example:
if foo()
this.x = 0
end
this.y = 1
this.z = 2
this.w = 3
The first one can use this=new(), because we know this has not been set yet. Only the second one requires a check, and the rest need no checks.
So we can have recursive constructors with no redundant allocations and allow uninitialized fields with very little overhead.
As per your recent email on hash tables, ?= could even become a full-fledged language feature. We could support it for any assignable thing. People might even like it for local variables, because sometimes you have complex logic that might end up setting some variable, and then later want something like "set it to this other thing if it hasn't been set yet". I've seen this in matlab code. ?= would let you avoid an extra flag. Since we already go to the trouble of checking for undefined variables, we might as well expose this capability to the user.
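The local-variable use case can be spelled out by hand in current Julia with the `@isdefined` macro. This is only a sketch of what the proposed `?=` would abbreviate; the `?=` spelling is the proposal from this thread, and the function name `pick` and its strings are invented for the example.

```julia
# `x` is set only on some paths; the fallback applies when it was never set.
function pick(flag::Bool)
    if flag
        x = "set by the complex logic"
    end
    # Spelled-out equivalent of the proposed `x ?= "fallback"`.
    if !@isdefined(x)
        x = "fallback"
    end
    return x
end

println(pick(true))
println(pick(false))
```

No extra flag variable is needed, which is the saving the comment points to.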
from julia.
So this is the basic plan:
- allow undefined values
- allow checking for undefined values
- allow conditional setting of undefined values via ?=
- throw an exception immediately on any use of an undefined value
This seems like a pragmatic combination. The last point is really important because it prevents "null check hell" and prevents undefined values from silently propagating through a program, thereby also avoiding many null checks, since anything that's been used cannot possibly be null.
from julia.
A couple weeks ago Stephan B. observed that you might have a constructor like:
MyType() = return this
So we actually need to convert every occurrence of this to (this?=new(...); this). Might seem extreme, but perfectly doable.
Is the design settled? It seems a bit crazy but I feel like we really want recursive constructors. It's a clever feature. In other languages you have to do things like Class::Create(...) when you need this ability.
from julia.
This whole thing still feels fishy to me, but I don't have anything better :-(
from julia.
You know, I'm thinking that having a new() function that creates an uninitialized object feels better than this. It feels really wrong the way this is special here. Providing a new() function is also the least magical solution, which is nice: it's perfectly understandable and explainable. We could allow optional arguments to new() which could be used to assign fields if they're given. That's frequently a lot clearer and more concise.
from julia.
Using this approach, the default constructor would just be this:
type Foo
bar
baz
Foo(bar,baz) = new(bar,baz)
end
This would be provided by default, of course, so the above definition would be unnecessary. Here are various examples to try this on and see how it feels...
type MyType
self::MyType
MyType() = (this = new(); this.self = this)
end
A little verbose but pretty clear.
type Rational{T<:Int} <: Real
num::T
den::T
Rational(n::Int, d::Int) = (g = gcd(n,d); new(div(n,g),div(d,g)))
end
Rational{T}(n::T, d::T) = Rational{T}(n,d)
Rational(n::Int, d::Int) = Rational(promote(n,d)...)
Rational(n::Int) = Rational(n,one(n))
Clean, clear, short. I like it.
RopeString(h::RopeString, t::RopeString) =
depth(h.tail) + depth(t) < depth(h.head) ?
RopeString(h.head, RopeString(h.tail, t)) :
new(h, t, max(h.depth, t.depth)+1, length(h)+length(t))
Also simple, clear and concise.
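For readers trying these examples on a current Julia release, here is a hedged translation of the first two into today's syntax. The type names `SelfRef` and `Ratio` are hypothetical stand-ins chosen to avoid clashing with Base, `Integer` is used where the thread writes `Int`, and the `<: Real` supertype is left out to keep the sketch small.

```julia
# Self-referential object: allocate with an argument-less `new()`, then fill in.
mutable struct SelfRef
    self::SelfRef
    function SelfRef()
        obj = new()        # `self` is still undefined at this point
        obj.self = obj
        return obj
    end
end

# Immutable rational-like type: every field is passed to `new`, already normalized.
struct Ratio{T<:Integer}
    num::T
    den::T
    function Ratio{T}(n::T, d::T) where {T<:Integer}
        g = gcd(n, d)
        return new{T}(div(n, g), div(d, g))
    end
end

# Outer constructors mirroring the three convenience methods above.
Ratio(n::T, d::T) where {T<:Integer} = Ratio{T}(n, d)
Ratio(n::Integer, d::Integer) = Ratio(promote(n, d)...)
Ratio(n::Integer) = Ratio(n, one(n))

s = SelfRef()
r = Ratio(6, 4)
println(s.self === s)
println((r.num, r.den))
```

The mutable case uses `new()` with no arguments and the immutable case supplies every field, which is the split between optional and required arguments discussed in this thread.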
from julia.
This would also jibe well with immutable objects: you create them in their final form using new() and then can't change them. This would mean that immutable objects cannot be self-referential, which is actually rather nice.
from julia.
It's a good point that new() is actually pretty nice and concise when it works. And it's perfect for immutable objects---just a matter of making the arguments to new required instead of optional. Elegant. Yes, this is the better solution.
My only remaining problem is that I want to rename the clone function to new. clone is a terrible name, since it doesn't clone but makes new uninitialized objects. Can we think of another name, or some other way to get around this conflict?
from julia.
For posterity, it should be noted that half of this discussion took place in this email thread.
from julia.
My only remaining problem is that I want to rename the clone function to new. clone is a terrible name, since it doesn't clone but makes new uninitialized objects. Can we think of another name, or some other way to get around this conflict?
I agree that clone is a lousy name for this but I don't really think new is much better for that purpose, and new is clearly a good name for the function that makes new objects. I always do a double-take and have to remember what clone actually does every time I see it. I've always felt that it was weird and slightly fishy that arrays have their own special construction function. It feels like it's begging for a cleaner and/or more general solution. How about calling it make_new or make or new_like or something like that?
from julia.
It's not supposed to be a special constructor for arrays. It's part of the mutable container interface. It lets you write generic container manipulations like
function replace(c, a, b)
n = make_new(c)
for k = keys(c)
n[k] = (c[k]==a ? b : c[k])
end
n
end
How about create or alloc?
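The generic pattern above can be tried on a current Julia release by assuming that `similar`, which allocates an uninitialized array of the same shape and element type, is an acceptable stand-in for the function being named here. The name `replace_values` is hypothetical, and the sketch is restricted to arrays.

```julia
# Build a new container like `c`, replacing every occurrence of `a` with `b`.
function replace_values(c::AbstractArray, a, b)
    n = similar(c)                      # same shape and element type, contents unset
    for k in keys(c)
        n[k] = (c[k] == a ? b : c[k])
    end
    return n
end

v = [1, 2, 1, 3]
w = replace_values(v, 1, 0)
println(w)
```

Every slot of the fresh container is written before it is read, so the uninitialized contents are never observed.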
from julia.
branch merged in commit 587f36b. fixed.
from julia.